> The write clustering issue has already been discussed (mainly at Miami)
> and the agreement, AFAIK, was to implement the write clustering at the
> per-address-space writepage() operation.
>
> IMO there are some problems if we implement the write clustering in this
> level:
>
> - The filesystem does not have information (and should not have) about
> limiting cluster size depending on memory shortage.
Is there ever a reason NOT to do the best possible IO
clustering at write time ?
Remember that disk writes do not cost memory and have
no influence on the resident set ... completely unlike
read clustering, which does need to be limited.
> - By doing the write clustering at a higher level, we avoid a ton of
> filesystems duplicating the code.
>
> So what I suggest is to add a "cluster" operation to struct address_space
> which can be used by the VM code to know the optimal IO transfer unit in
> the storage device. Something like this (maybe we need an async flag but
> thats a minor detail now):
>
> int (*cluster)(struct page *, unsigned long *boffset,
> unsigned long *poffset);
Makes sense, except that I don't see how (or why) the _VM_
should "know the optimal IO transfer unit". This sounds more
like a job for the IO subsystem and/or the filesystem, IMHO.
> "page" is from where the filesystem code should start its search
> for contiguous pages. boffset and poffset are passed by the VM
> code to know the logical "backwards offset" (number of
> contiguous pages going backwards from "page") and "forward
> offset" (cont pages going forward from "page") in the inode.
Yes, this makes a LOT of sense. I really like a pagecache
helper function so the filesystems can build their writeout
clusters easier.
regards,
Rik
-- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose...http://www.surriel.com/ http://www.conectiva.com/ http://distro.conectiva.com.br/
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org Please read the FAQ at http://www.tux.org/lkml/