yes, it's definitely sane.
> we allocate that stuff dynamically. With your variant we _must_ call
> invalidate_inodes() here to force eviction from icache. What's more,
> not calling it will screw up non-deterministically - it will survive
> if inode gets evicted in the right interval and produce whatever damage
> it's going to produce if eviction happens too late.
>
> Again, what we really want here is "don't keep inodes dropped during
> ->read_super() or ->put_super() in icache". You propose to stick
> invalidate_inodes() in a bunch of places so that it would kill these
> inodes before it's too late. For some filesystems it would be
correct.
> covered by ones you add in fs/super.c, for other it would need
> explicit calls, required positions may depend on the fs internals
> and change with them... What I propose is "don't wait, kill them
> immediately and forget about the whole thing".
I don't like bloating iput with code just needed for backwards
compatibility with buggy code and for a non common case (infact it's not
even backwards compatibiltiy, if something it would be instead bugwards
compatibility 8).
OTOH I see the iget/iput semantics within the read_super/put_super (not
the code!) would be cleaner your way, so we don't even need to document
that if a per-sb ->clear_inode needs to access any per-sb special
structures, invalidate_inodes(sb) must be called before freeing the
per-sb structures [always ignoring any possible flush] (plus the fact
sb->s_op must not be clobbered before returning from read_super, so that
the vfs can see the s_op->clear_inode, but nobody should do that
anyways). btw, even your way we should still make sure nobody is
calling iput after freeing the per-sb structures :).
In practice I had a very short look at all the ->clear_inode implemented
and none seems to need the special per-sb structures, so I think also my
patch is just fine for current tree 2.4. (I guess somebody should check
xfs and jfs too)
I think long term (2.5) the right way is to replace all the iput in the
slow fail paths with a iput_not_mounted, that will avoid both the iput
clobbering and the MS_ACTIVE tracking. The differentiation should be
quite self documenting (so people will taste that this iput_not_mounted
is kind of raw thing that will flush + ->clear_inode and destroy the
inode synchronously, so ->clear_inode has to work while recalling
iput_not_mounted). It should be very easy to identify the iputs in the
read_super/put_super paths to replace them with the iput_not_mounted (at
least for the normal fs like ext2/minix etc..).
Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/