Re: 2.4.19pre5aa1 and splitted vm-33

Andrea Arcangeli (andrea@suse.de)
Mon, 1 Apr 2002 02:04:17 +0200


On Sun, Mar 31, 2002 at 08:00:38PM +0100, Christoph Hellwig wrote:
> On Sun, Mar 31, 2002 at 08:38:30PM +0200, Andrea Arcangeli wrote:
> > Hmm that's a problem. BTW, is there any valid reason xfs isn't
> > implementing O_DIRECT via the direct_IO address space operation like the
> > other filesystems ext2/reiserfs/nfs? (of course I'm talking long term, I
> > don't pretend to change that in one day, for the short term we should
> > still allow xfs to use its own internal methods of doing O_DIRECT)
>
> For definitve answers you have to ask Steve or some else from the XFS crow,
> but if you look at the XFS data I/O path is is completly different from
> the generic Linux filesystems due to the IRIX history/compatiblity and the
> 'need' for features not present in the core kernel. Before 2.4.10 one
> of those was O_DIRECT, btw..

Yep I understand that, it was more a request for comments on the API, to
learn if they've any design limitation with that generic API, I know it
would be tricky to convert xfs, not something I'd expect in a minor
release, I was talking a bit longer term, not necessairly for 2.4. XFS
doesn't journal data so my point was it should be just fine in theory
with the current aops direct_IO API.

For 2.5 we also need a generic library function that puts anchors into
the pagecache hash too, simulating locked pages, to serialize the I/O on
the data too during O_DIRECT to allow cleanly O_DIRECT with _data_
journaling.

> > Fixing that should be a one liner setting a_ops->direct_IO to
> > ERR_PTR(-1L) within xfs. Comments?
>
> Looks good to me.

Ok.

> > > For the open case putting the check into generic_file_open sounds good to me,
> >
> > One problem with generic_file_open is that we'd need to rely on all
> > lowlevel open callbacks to do the check properly if they don't support
> > O_DIRECT and we'd need to duplicate code, only a few fs uses
> > generic_file_open. Lefting it in the vfs looked cleaner, it reduces the
> > check to a few liner patch.
> >
> > The lowlevel callback can always drop the kiovec during its own ->open
> > if they don't need it, even if they uses direct_IO, like nfs.
>
> I don't think this is a good design decision, but I can live with it for
> 2.4 - for 2.5 I'm still hoping for bcrls kvec to replace the current kiobuf
> stuff and we have to see what that means for O_DIRECT.

Avoiding code duplication and taking the API simpler is the good part,
but I see nfs would be penalized (first it allocates and the
deallocates). I agree it's not the best design to get the maximal
performance with nfs and some other, it was mostly a 2.4 thing infact,
as you say with 2.5 if the kiobuf overhead is dropped then we may not
need to preallocate the kiobuf in the first place, then this whole
discussion about "where to preallocate" will be pointless. Also the
kiobuf-slab patch should just make it significantly faster compared to
the current prohibitive vmalloc.

> > > kiobufs even if XFS doesn't need them..
> >
> > Chuck's patch is handling fcntl as well, do you see any problem there?
> > (modulo the fact the kiovec is pre-allocated even if not necessary, like
> > in open(2)?)
>
> It's this same issue, there is no other problem in my eyes.

Ok.

Thanks,

Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/