Re: Bio pool & scsi scatter gather pool usage

David Mansfield (david@cobite.com)
Thu, 25 Apr 2002 15:59:43 -0400 (EDT)


> > > > > In EVMS, we are adding code to deal with BIO splitting, to
> > > > > enable our feature modules, such as DriveLinking, LVM, & MD
> > > > > Linear, etc to break large BIOs up on chunk size or lower
> > > > > level device boundaries.
> > > >
> > > > Could I suggest that this code not be part of EVMS, but that
> > > > you implement it as a library within the core kernel? Lots of
> > > > stuff is going to need BIO splitting - software RAID, ataraid,
> > > > XFS, etc. May as well talk with Jens, Martin Petersen, Arjan,
> > > > Neil Brown. Do it once, do it right...
> > > >
> > > I take that back.
> > >
> > > We really, really do not want to perform BIO splitting at all.
> > > It requires that the kernel perform GFP_NOIO allocations at
> > > the worst possible time, and it's just broken.
> > >
> > > What I would much prefer is that the top-level BIO assembly
> > > code be able to find out, beforehand, what the maximum
> > > permissible BIO size is at the chosen offset. It can then
> > > simple restrict the BIO to that size.
> > >

[snipped some ideas]

>
> Why not just put the smallest required BIO size in a struct for that device?
> Then each read of that struct can be kept in cache...
>
> Is the BIO max size going to change at different offsets?

My two cents as a non-guru: there are two different reasons for splitting
a large BIO:

1) some layer has hit some uncommon boundary condition, like spanning
linearly appended physical volumes in an LV or something like that

2) a fundamental 'maximum-chunkiness' allowed by some layer has been
exceeded, like stripe size in a raid, or MAX_SECTORS in ide or something
like that.

It would suck if the system generated large BIOS that needed to be split
for every IO operation, due to #2, but it would also suck to add overhead
to every IO operation for #1.

#1 is an exception, and I think it would be acceptible to have a splitting
function/mempool for handling what should be a boundary condition only,
and the concept of a call through the layers to find out #2 at open time
would handle #2 one time per device or something like that.

David

-- 
/==============================\
| David Mansfield              |
| david@cobite.com             |
\==============================/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/