That's also a bug.
So imagine that there is another user keeping the bdev active. That implies
that you never even try to sync it at all, which sounds like a bad thing to
do.
> > data after the first one has done __block_fsync? And the truncate will
> > throw the dirtied page away?
>
> There can't be any truncate because the blkdev isn't a regular file.
You said you used truncate_inode_pages(), which _does_ throw the pages
away. Dirty or not.
What I'm saying is that you at _every_ close (sane semantics for block
devices really do expect the writes to be flushed by close time - how
would they otherwise ever be flushed reliably?) do something like
	fsync(inode);
	invalidate_inode_pages(inode);
	invalidate_device(inode->i_rdev);
and be done with it. That syncs the pages that we've dirtied, and it
invalidates all pages that aren't pinned some way. Which is exactly what
you want.
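To make the difference concrete, here is a minimal userspace sketch of that
close path. It is only an illustration: the toy_* names and the page model
are invented for this example and are not kernel interfaces, and the real
functions do considerably more. It contrasts "sync, then drop what isn't
pinned" with a truncate-style drop that discards dirty pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PAGES 8

/* Toy stand-in for one cached page of a block device (not a kernel type). */
struct toy_page {
	bool present;	/* still in the cache */
	bool dirty;	/* modified, not yet written back */
	bool pinned;	/* held by some other user, e.g. a mounted fs */
};

struct toy_bdev {
	struct toy_page pages[TOY_MAX_PAGES];
	size_t npages;
	unsigned writes;	/* pages written back so far */
	unsigned lost;		/* dirty pages discarded without writeback */
};

/* Step 1: write back every dirty page (the "fsync" step). */
static void toy_sync(struct toy_bdev *bdev)
{
	for (size_t i = 0; i < bdev->npages; i++) {
		struct toy_page *p = &bdev->pages[i];

		if (p->present && p->dirty) {
			bdev->writes++;
			p->dirty = false;
		}
	}
}

/* Step 2: drop pages that are neither pinned nor dirty. */
static void toy_invalidate(struct toy_bdev *bdev)
{
	assert(bdev->npages <= TOY_MAX_PAGES);
	for (size_t i = 0; i < bdev->npages; i++) {
		struct toy_page *p = &bdev->pages[i];

		if (p->present && !p->pinned && !p->dirty)
			p->present = false;
	}
}

/* The same sequence on every close, whoever else has the device open. */
static void toy_close(struct toy_bdev *bdev)
{
	toy_sync(bdev);
	toy_invalidate(bdev);
}

/* For contrast: a truncate-style drop throws pages away, dirty or not. */
static void toy_truncate(struct toy_bdev *bdev)
{
	for (size_t i = 0; i < bdev->npages; i++) {
		struct toy_page *p = &bdev->pages[i];

		if (p->present && p->dirty)
			bdev->lost++;
		p->present = false;
		p->dirty = false;
	}
}
```

With toy_close() a dirty page is always counted in "writes" before it can
be dropped, and pinned pages stay in the cache. With toy_truncate() every
dirty page ends up in "lost" instead.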
> that's definitely not enough, see the other issue mentioned by Andreas
> in this thread, the reason I wrote the algorithm I explained in the
> previous email is as first thing to eventually avoid infinite long fsck
> of the root fs.
Ehh? Why? The above writes back exactly the same thing that our current
block filesystem writes back, while "invalidate_device()" also throws away
all buffers that aren't pinned.
And the superblock isn't in the buffer cache - it's cached separately, so
invalidate_device() will throw away the buffer associated with it - to be
re-read and re-written by the rw remount.
Will it be different than the current behaviour wrt some other metadata?
Yes. So you could make invalidate_device() stronger, trying to re-read
buffers that aren't dirty. But that doesn't mean that you should act
differently on FS mounted vs not-mounted vs some-other-user.
Linus