Look again - for regular files we _never_ fsync it. Not on the first
close, not on the last one.
And I'm telling you that device files are different. Things like fdisk and
fsck expect the data to be on the disk after they've closed the device.
They do not expect the data to be on disk only if they were the only
opener, or other special cases.
End of story.
> The whole point is that the two users obviously are sharing the same
> cache so there's no issue there.
There may not be any issue for _them_, but there may easily be an issue
between the user that does the first close, and the next
non-page-cache-user.
This is not about coherency between those two users. That coherency is
something we can take for granted if they both use the page cache, and as
such you shouldn't even bring that issue up - it's not interesting.
What _is_ interesting is the assumption of the disk state of _one_ of the
users, and its expectations about the disk contents etc being up-to-date
after a close(). Up-to-date enough that some other user that does NOT use
the page cache (ie a mount) should see the correct data.
And it sounds very much like your current approach completely misses this.
> The only case we need to care specially is the coherency between
> filesystems and blkdev users and that's what I do with the algorithm
> explained earlier.
Yeah, you explained that you had multiple different approaches depending
on whether it was mounted read-only or not etc etc. Which sounds really
silly to me.
I'm telling you that ONE approach should be the right one. And I'm further
telling you that that one approach is obviously a superset of your "many
different modes of behaviour depending on what is going on".
Which sounds a lot cleaner, and a lot simpler than your special case
heaven.
> We don't flush at every close even in current 2.4 buffercache backed.
Because we don't need to. There are no coherency issues wrt metadata
(which is the only thing that fsck changes).
If fsck changed data contents, we'd better flush dirty buffers, believe
me.
> > fsync(inode);
> > invalidate_inode_pages(inode);
> > invalidate_device(inode->b_dev);
>
> That's not enough.
Tell me why. Yes, you need the extension that I pointed out:
> > Yes. So you could make invalidate_device() stronger, trying to re-read
> > buffers that aren't dirty. But that doesn't mean that you should act
> > differently on FS mounted vs not-mounted vs some-other-user.
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/