Makes sense. I suspect it may even worsen the problem I observed
with the mpage code. Set the readahead to 256k with `blockdev --setra 512'
and then run tiobench. The read latencies are massive - one thread
gets hold of the disk head and hogs it for 30-60 seconds.
The readahead code has a sort of double-window design. The idea is that
if the disk does 50 megs/sec and your application processes data at
49 megs/sec, the application will never block on I/O. At 256k readahead,
the readahead code will be laying out four BIOs at a time. It's probable
that the application is actually submitting BIOs for a new readahead
window before all of the BIOs for the old one have completed, so the
application's new reads end up being merged against its own
still-outstanding ones.
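To make the pipelining arithmetic concrete, here's a throwaway
userspace model of it (the rates and the 256k window are just the
numbers from the example above - none of this is real kernel code):

/*
 * Toy model of the double-window argument: while the app is chewing
 * on window N, window N+1 is already in flight.  As long as the disk
 * can fill a window faster than the app can drain one, the reader
 * never blocks on I/O.
 */
#include <stdio.h>

int main(void)
{
        const double disk_rate = 50.0;          /* MB/sec the disk delivers */
        const double app_rate  = 49.0;          /* MB/sec the app consumes */
        const double window_mb = 256.0 / 1024;  /* one 256k readahead window */

        double fill  = window_mb / disk_rate;   /* time to read one window */
        double drain = window_mb / app_rate;    /* time to consume one window */

        printf("fill %.4fs, drain %.4fs -> %s\n", fill, drain,
               fill <= drain ? "the app never blocks" : "the app stalls on I/O");
        return 0;
}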
Given all this, what I would expect to see is for thread "A" to capture
the disk head for some period of time, until eventually one of thread "B"'s
requests expires its latency. Then thread "B" gets to hog the disk head.
That's reasonable behaviour, but the latencies are *enormous*. Almost
like the latency stuff isn't working. But it sure looks OK.
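For what it's worth, this is how I picture the anti-starvation part
working, boiled down to a toy (the names and the list walk are mine,
not the kernel's):

/*
 * Each queued request carries a countdown of how many later requests
 * it is prepared to let in ahead of it.  Every insertion that wants
 * to jump the queue costs the passed-over requests one tick each;
 * once any of them hits zero, nothing more may pass, and the starved
 * thread's request finally gets the disk head back.
 */
struct queued_req {
        struct queued_req *next;
        int sequence;           /* passes this request will still tolerate */
};

/* May a new request be queued ahead of everything from `rq' to the tail? */
static int may_queue_ahead(struct queued_req *rq)
{
        for (; rq; rq = rq->next)
                if (rq->sequence-- <= 0)
                        return 0;       /* latency expired: go to the tail */
        return 1;
}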
Not super-high priority at this time. I'll play with it some more.
(Some userspace tunables for the elevator would be nice. Hint. ;))
hmm. Actually the code looks a bit odd:
if (elv_linus_sequence(__rq)-- <= 0)
        break;
if (!(__rq->flags & REQ_CMD))
        continue;
if (elv_linus_sequence(__rq) < bio_sectors(bio))
        break;
The first decrement is saying that elv_linus_sequence is in units of
requests, but the comparison (and the later `-= bio_sectors()') seems
to be saying it's in units of sectors.
I think calculating the latency in terms of requests makes more sense - just
ignore the actual size of those requests (or weight it down in some manner).
But I don't immediately see what the above code is up to.
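If the unit really is meant to be requests, I'd have expected
something closer to this (just a sketch of the suggestion above, not
a proposed patch):

if (elv_linus_sequence(__rq)-- <= 0)
        break;
if (!(__rq->flags & REQ_CMD))
        continue;
/* no bio_sectors() comparison: the latency budget costs one tick per
   passed request, regardless of how big the new bio is */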