Re: [PATCH] 2.4.18 scheduler bugs

Rob Landley (landley@trommello.org)
Mon, 18 Mar 2002 21:08:16 -0500


On Friday 15 March 2002 04:39 pm, Alan Cox wrote:
> > - ksoftirqd() - change daemon nice(2) value from 19 to -19.
> >
> > SoftIRQ servicing was less important than the most lowly of batch
> > tasks. This patch makes it more important than all but the realtime
> > tasks.
>
> Bad idea - the right fix to this is to stop using ksoftirqd so readily
> under load. If it bales after 20 iterations life is good. As shipped life
> is bad.
>
> Once ksoftirq triggers its because we are seriously overloaded (or without
> fixing its use slightly randomly). In that case we want other stuff to
> do work before we potentially unleash the next flood.

Also, I'm not sure if this is still the case but in earlier versions of the
O(1) scheduler, nice(19) tasks tended to get their entire timeslice in one
big long uninterrupted gulp. (By the time they got a slice, everything with
a higher priority has already run.) This was great for long cpu-intensive
loads because it meant your cache stayed hot, so you wound up churning
through your CPU-bound load even faster than if you got little snippets of
time and had to keep reloading your cache. (Great for interactive latency,
sucks for number crunching.)

Higher priority tasks get first crack at the CPU, but that doesn't mean they
get to keep it for long. High priority is for latency reasons, not for
throughput reasons. If your router spills over to ksoftirqd, you're already
beyond optimizing for latency. If ksoftirqd is doing ping-pong scheduling
with cron, overall throughput is worse.

Rob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/