Re: [RFC] generic IO write clustering

Rik van Riel (riel@conectiva.com.br)
Sat, 20 Jan 2001 13:58:38 +1100 (EST)

Messages sorted by: [ date ][ thread ][ subject ][ author ]
Next message: Arnaldo Carvalho de Melo: "Re: -ac10 compile error"
Previous message: Leslie Donaldson: "Re: Patch for aic7xxx 2.4.0 test12 hang"

On Fri, 19 Jan 2001, Marcelo Tosatti wrote:

> The write clustering issue has already been discussed (mainly at Miami)
> and the agreement, AFAIK, was to implement the write clustering at the
> per-address-space writepage() operation.
>
> IMO there are some problems if we implement the write clustering in this
> level:
>
> - The filesystem does not have information (and should not have) about
> limiting cluster size depending on memory shortage.

Is there ever a reason NOT to do the best possible IO
clustering at write time ?

Remember that disk writes do not cost memory and have
no influence on the resident set ... completely unlike
read clustering, which does need to be limited.

> - By doing the write clustering at a higher level, we avoid a ton of
> filesystems duplicating the code.
>
> So what I suggest is to add a "cluster" operation to struct address_space
> which can be used by the VM code to know the optimal IO transfer unit in
> the storage device. Something like this (maybe we need an async flag but
> thats a minor detail now):
>
> int (*cluster)(struct page *, unsigned long *boffset,
> unsigned long *poffset);

Makes sense, except that I don't see how (or why) the _VM_
should "know the optimal IO transfer unit". This sounds more
like a job for the IO subsystem and/or the filesystem, IMHO.

> "page" is from where the filesystem code should start its search
> for contiguous pages. boffset and poffset are passed by the VM
> code to know the logical "backwards offset" (number of
> contiguous pages going backwards from "page") and "forward
> offset" (cont pages going forward from "page") in the inode.

Yes, this makes a LOT of sense. I really like a pagecache
helper function so the filesystems can build their writeout
clusters easier.

regards,

Rik

-- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose...

http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com.br/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/

Next message: Arnaldo Carvalho de Melo: "Re: -ac10 compile error"
Previous message: Leslie Donaldson: "Re: Patch for aic7xxx 2.4.0 test12 hang"