No, you can actually do all the "prepare_write()"/"commit_write()" stuff
that the filesystems already do. And you can do it a lot _better_ than the
current buffer-cache-based approach. Done right, you can actually do all
IO in page-sized chunks, BUT fall down on sector-sized things for the
cases where you want to.
This is exactly the same issue that filesystems had with writers of less
than a page - and the page cache interfaces allow for byte-granular writes
(as actually shown by things like NFS, which do exactly that. For a block
device, the granularity obviously tends to be at least 512 bytes).
> Of course, we could just say "then don't do that" and be done with it
> --- after all, we already have this behaviour when writing to regular
> files.
No, we really don't. When you write an aligned 1kB block to a 1kB ext2
filesystem, it will _not_ do a page-sized read-modify-write. It will just
create the proper 1kB buffers, and mark one of the dirty.
Now, admittedly it is _easier_ to just always consider things 4kB in
size. And faster too, for the common cases. So it might not be worth it to
do the extra work unless somebody can show a good reason for it.
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/