Nice shooting; this could explain the effect I noticed where
writing a linker file takes 8 times longer when competing with
a simultaneous grep.
> There are two possible solutions to the starvation scenario:
>
> 1) have one threshold
> 2) if one task is sleeping, let ALL tasks sleep
> until we reach the lower threshold
Umm.... Hmm, there are lots more solutions than that, but those two
are nice and simple. A quick test for (1) I hope Ben will try is
just to set high_queued_sectors = low_queued_sectors.
Currently, IO scheduling relies on the "random" algorithm for fairness
where the randomness is supplied by the processes. This breaks down
sometimes, spectacularly, for some distinctly non-random access
patterns as you demonstrated.
Algorithm (2) above would have some potentially strange interactions
with the scheduler; it looks scary. (E.g., change the scheduler, and IO
on some people's machines suddenly goes to hell.)
Come to think of it, (1) will also suffer in some cases from nonrandom
scheduling.
Now let me see, why do we even have the high+low thresholds? I
suppose it is to avoid taking two context switches on every submitted
block, so it seems like a good idea.
For IO fairness I think we need something a little more deterministic.
I'm thinking about an IO quantum right now - when a task has used up
its quantum it yields to the next task, if any, waiting on the IO
queue. How to preserve the effect of the high+low thresholds... it
needs more thinking, though I've already thought of several ways of
doing it badly :-)
-- Daniel