Re: performance of O_DIRECT on md/lvm

Andrea Arcangeli (andrea@suse.de)
Mon, 21 Jan 2002 02:12:24 +0100


On Sun, Jan 20, 2002 at 10:28:21PM +0100, Andi Kleen wrote:
> Andrea Arcangeli <andrea@suse.de> writes:
> >
> > if you read in chunks of a few mbytes per read syscall, the lack of
> > readahead shouldn't make much difference (this is true for both raid and
> > standalone device). If there's a relevant difference it's more liekly an
> > issue with the blocksize.
>
> The problem with that is that doing overlapping IO requires much more
> effort (you need threads in user space). If you don't do overlapping
> IO you add a latency bubble for each round trip to user space after you
> read one big chunk and submitting the request for the next big chunk.
> Your disk will not be constantly streaming, because of these pauses where
> it doesn't have an request to process.

correct, we can't keep the pipeline always full, the larger the size of
the read/write, the lower it will matter, this is the only way to hide
the pipeline stall at the moment (like with rawio).

> The application could do it using some aio setup, but it gets rather
> complicated and the kernel already knows how to do that well.

yes, in short the API to allow the userspace to keep the I/O pipeline
full with a ring of user buffers is not available at the moment.

As you say one could try to workaround it by threading the I/O in
userspace but it would get rather dirty (and with a scheduling
overhead).

>
> I think an optional readahead mode for O_DIRECT would be useful.

to do transparent readahead we'll need to use the pagecache, so we'd need
to make copies of pages with the cpu between usermemory and pagecache,
but the nicer part of O_DIRECT is that it skips the costly copies
with the cpu on the membus, so I usually disagree about trying to allow
O_DIRECT to support readahead. I believe if you need readahead, you
probably shouldn't use O_DIRECT in the first place.

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/