Sure, but even though the batch wakeup code didn't have starvation
issues, the overall get_request latency was still high. The end result
was basically the same, without q->full we've got a higher max wait and
a lower average wait. With batch wakeup we've got a higher average
(300-400 jiffies) and a lower max (800-900 jiffies).
Especially for things like directory listings, where 2.4 generally does
io a few blocks at a time, the get_request latency is a big part of the
latency an interactive user sees.
> >So, the way I see things, we've got a few choices.
> >
> >1) do nothing. 2.6 isn't that far off.
> >
> >2) add elevator-lowlatency without q->full. It solves 90% of the
> >problem
> >
> >3) add q->full as well and make it the default. Great latencies, not so
> >good throughput. Add userland tunables so people can switch.
> >
> >4) back port some larger chunk of 2.5 and find a better overall
> >solution.
> >
> >I vote for #3, don't care much if q->full is on or off by default, as
> >long as we make an easy way for people to set it.
> >
>
> 5) include the "q->full" starvation fix; add the concept of a
> queue owner, the batching process.
>
I've tried two different approaches to #5, the first is a just a
batch_owner where other procs are still allowed to grab requests and the
owner was allowed to ignore q->full. The end result was low latencies
but not much better throughput. With a small number of procs, you've
got a good chance bdflush is going to get ownership and the throughput
is pretty good. With more procs the probability of that goes down and
the throughput benefit goes away.
My second attempt was the batch wakeup patch from yesterday. Overall I
don't feel the latencies are significantly better with that patch than
with Andrea's elevator-lowlatency and q->full disabled.
> I'm a bit busy at the moment and so I won't test this, unfortunately.
> I would prefer that if something like #5 doesn't get in, then nothing
> be done for .22 unless its backed up by a few decent benchmarks. But
> its not my call anyway.
>
Andrea's code without q->full is a good starting point regardless. The
throughput is good and the latencies are better overall. q->full is
simple enough that making it available via a tunable is pretty easy. I
really do wish I could make one patch that works well for both, but I've
honestly run out of ideas ;-)
-chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/