That is certainly the case. If the application is seekily writing
to a file then we currently lay the file out on-disk in the order
in which the application seeked. So reading the file back
linearly is very slow.
Now this is not necessarily a bad thing - if the file was created
seekily then it will probably be _used_ seekily, so no big loss.
This problem is pretty unsolvable for filesystems which map blocks
to their disk address at write(2) time. It can be solved for
allocate-on-flush filesystems via a sort of the dirty page list,
or by maintaining ->dirty_pages in a tree or whatever.
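For concreteness, here's a toy userspace sketch of the keep-it-sorted
variant: insert each newly-dirtied page at its sorted position so that
writeout walks the file front-to-back.  (struct page here is a
two-field stand-in, not the real thing, and this is O(n) per insert;
a tree would give the same ordering at O(log n) per insert.)

#include <stdio.h>

struct page {
	unsigned long index;	/* page offset within the file */
	struct page *next;	/* next page on the dirty list */
};

/* Insert 'page' into the dirty list, keeping it sorted by ->index */
static void dirty_list_add_sorted(struct page **head, struct page *page)
{
	struct page **pp = head;

	while (*pp && (*pp)->index < page->index)
		pp = &(*pp)->next;
	page->next = *pp;
	*pp = page;
}

int main(void)
{
	struct page pages[] = { {5}, {2}, {6}, {0} };	/* seeky order */
	struct page *head = NULL, *p;
	int i;

	for (i = 0; i < 4; i++)
		dirty_list_add_sorted(&head, &pages[i]);
	for (p = head; p; p = p->next)	/* writeout walks 0,2,5,6 */
		printf("page %lu\n", p->index);
	return 0;
}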
There is one "file" where this problem really does matter - the
blockdev mapping "/dev/hda1". It is both highly fragmented and
poorly sorted on the dirty_pages list.
It's pretty trivial to perform a sillysort at writeout time:
if we just wrote page N and the next page isn't N+1 then do a
pagecache probe for page N+1.  That's probably sufficient.  If
not, there's a simple little sort routine over at
http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html
which is appropriate to our lists.
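To make the probe idea concrete, here's a self-contained toy model:
the pagecache is just a dirty[] array indexed by file offset, and
after writing page N we probe for N+1 and write it immediately if
it's still dirty.  Applying the rule after every write chains through
each consecutive run.  (All names here are made up for the example;
in the kernel the probe would be a pagecache lookup, not an array
index.)

#include <stdio.h>
#include <stdbool.h>

#define NR_PAGES 8

static bool dirty[NR_PAGES];

static void write_page(unsigned long index)
{
	printf("writing page %lu\n", index);
	dirty[index] = false;
}

int main(void)
{
	/* Pages dirtied in "seeky" order: this is the dirty list. */
	unsigned long dirty_order[] = { 5, 2, 6, 0, 3, 1 };
	int i;

	for (i = 0; i < 6; i++)
		dirty[dirty_order[i]] = true;

	for (i = 0; i < 6; i++) {
		unsigned long n = dirty_order[i];

		if (!dirty[n])		/* already written via a probe */
			continue;
		write_page(n);

		/* The sillysort: probe for page n+1 and write it now
		 * rather than wherever it sits on the dirty list. */
		while (n + 1 < NR_PAGES && dirty[n + 1]) {
			n++;
			write_page(n);
		}
	}
	return 0;
}

Run on the seeky dirtying order 5,2,6,0,3,1 this emits 5,6,2,3,0,1 -
still not a full sort, but each contiguous run goes out in file order,
which is what the disk cares about.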
I'll be taking a look at the sillysort option once I've cleared away
some other I/O scheduling glitches.