Al's idea nicely dances around a big problem with the page cache: there
is no efficient way to know which address_space a given physical block
belongs to. It *might* be nice to have such capability in a
fs-independent way. We could do that now, very inefficiently, by
searching all the address_spaces (i.e., inodes) for the physical block.
We'd have to prevent further page cache operations while we did that,
and when we add fs-private address_spaces some more mechanism would be
required. So: slow, intrusive and fragile.
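
To make the cost of that search concrete, here is a toy userspace sketch of the brute-force lookup. It is not kernel code: the struct and function names (toy_page, toy_address_space, toy_find_owner) are hypothetical and only model the idea of scanning every address_space for a block.

```c
#include <stddef.h>

/* Toy userspace model, not kernel code. All names are hypothetical. */
struct toy_page {
    unsigned long block;        /* physical block backing this cached page */
};

struct toy_address_space {
    struct toy_page *pages;     /* pages currently cached for this inode */
    size_t nr_pages;
};

/*
 * Brute-force lookup: scan every address_space for the block.
 * The cost is proportional to the total number of cached pages, and a
 * real version would have to hold off page cache updates while it runs.
 */
static struct toy_address_space *
toy_find_owner(struct toy_address_space *spaces, size_t nr_spaces,
               unsigned long block)
{
    for (size_t i = 0; i < nr_spaces; i++)
        for (size_t j = 0; j < spaces[i].nr_pages; j++)
            if (spaces[i].pages[j].block == block)
                return &spaces[i];
    return NULL;                /* block is not in the page cache */
}
```

Every lookup walks the whole cache, which is the "slow" part; the "intrusive and fragile" parts (locking, fs-private spaces) are not modeled here at all.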

The only reasonable way I can think of getting a block-coherent view
underneath a mounted fs is to have a reverse map, and update it each
time we map a block into the page cache or unmap it. The reverse map
would tell us if a given physical block is currently in the page
cache, and if so, which address_space it belongs to. A block not
currently mapped into any address_space could be mapped into an
'anonymous' space covering the entire partition and moved automatically
to the correct address_space when the fs tries to map it.
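
The reverse map described above could be sketched roughly as follows. This is a toy userspace model under loud assumptions: a tiny fixed-size partition (TOY_NR_BLOCKS), a flat array as the map, no locking, and hypothetical names throughout (toy_space, toy_rmap, toy_map, toy_map_raw). A real implementation would need a sparser structure and proper synchronization.

```c
#include <stddef.h>

#define TOY_NR_BLOCKS 64        /* assumed partition size, in blocks */

/* Stand-in for an address_space; hypothetical, not the kernel struct. */
struct toy_space { const char *name; };

/* The 'anonymous' space covering the entire partition. */
static struct toy_space toy_anon = { "anon" };

/* toy_rmap[b] is the space block b is cached in, or NULL if uncached. */
static struct toy_space *toy_rmap[TOY_NR_BLOCKS];

/* Reverse lookup: which space, if any, currently holds this block? */
static struct toy_space *toy_lookup(unsigned long block)
{
    return block < TOY_NR_BLOCKS ? toy_rmap[block] : NULL;
}

/*
 * Raw access underneath the fs: if no space holds the block yet, cache
 * it in the anonymous space. Returns whichever space now holds it.
 */
static struct toy_space *toy_map_raw(unsigned long block)
{
    if (block >= TOY_NR_BLOCKS)
        return NULL;
    if (!toy_rmap[block])
        toy_rmap[block] = &toy_anon;
    return toy_rmap[block];
}

/*
 * The fs maps a block into one of its own spaces. Recording the new
 * owner also moves the block out of the anonymous space if it was there.
 */
static int toy_map(struct toy_space *space, unsigned long block)
{
    if (block >= TOY_NR_BLOCKS)
        return -1;
    toy_rmap[block] = space;
    return 0;
}

/* The block leaves the page cache: forget its owner. */
static void toy_unmap(unsigned long block)
{
    if (block < TOY_NR_BLOCKS)
        toy_rmap[block] = NULL;
}
```

Note that toy_map and toy_unmap sit on every map and unmap, which is exactly the per-operation overhead on the common path.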

The big problem with this mechanism is that it slows down the common
case, which works perfectly well without any reverse map. Not to mention
adding bloat. So the next question I thought about was: is there a way
to switch on a page cache reverse map just when needed, and do that in a
generic way? I convinced myself it wouldn't be too hard, but then
there's another question: how badly do we need this?

Al's idea does let us get at some of the specific parts of the fs
metadata, but it has its problems too. We'd need to exhaustively
enumerate every kind of filesystem metadata that could reasonably be
accessed underneath the filesystem, and special-case each one. Not so
nice.
But I couldn't come up with any killer examples where we'd really need
a generalized, coherent view underneath a mounted filesystem, so I put
these thoughts on hold. Your borked-fs example sounds interesting,
have you got more of those?

One more example I can suggest is: right now we have no way of
detecting an error condition where the same fs block is mapped into
more than one address_space. A page cache reverse map could detect
this easily and would be a really useful debugging tool.
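
As a rough illustration of that debugging use, the check could look something like the sketch below. Again this is a userspace toy with hypothetical names (dbg_space, dbg_owner, dbg_check_map) and an assumed small fixed partition size; it only shows where the duplicate would be caught.

```c
#include <stddef.h>

#define DBG_NR_BLOCKS 64        /* assumed partition size, in blocks */

/* Stand-in for an address_space; hypothetical, not the kernel struct. */
struct dbg_space { int id; };

/* dbg_owner[b] is the space that block b is mapped into, or NULL. */
static struct dbg_space *dbg_owner[DBG_NR_BLOCKS];

/*
 * Record that 'space' maps 'block'.
 * Returns 0 if recorded, 1 if the block is already mapped into a
 * different space (the error condition), -1 if block is out of range.
 */
static int dbg_check_map(struct dbg_space *space, unsigned long block)
{
    if (block >= DBG_NR_BLOCKS)
        return -1;
    if (dbg_owner[block] && dbg_owner[block] != space)
        return 1;               /* same block in two spaces: report it */
    dbg_owner[block] = space;
    return 0;
}
```

Without the map, the second mapping would simply succeed and the two cached copies could silently diverge.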

--
Daniel