1. getattr request goes out to get file size. Value will be
stale compared to inode->i_size, since writes are happening.
2. All writebacks for the inode complete.
3. getattr response returns with stale file size value.
4. __nfs_refresh_inode checks writebacks, finds none,
overwrites inode->i_size.
5. generic_file_write resets file position (O_APPEND) with
stale file size, overwriting previously written data.
Race is theoretically present in 2.2 as well, but the delay-based
flushing in 2.2 rarely flushes all writes away, so there are usually
requests on the writeback list. Only became a problem in 2.4 with
more aggressive flushing clearing out all writebacks.
Here's a patch (2.4.16) that works for us, it eliminates the race in the
most common case. BKL and NFS_REVALIDATING prevent new writes from
coming in while the getattr is happening, so we just need to check for
writebacks before starting the getattr. Not a perfect solution, there's
still a potential for races with direct calls to nfs_refresh_inode that
bypass nfs_revalidate_inode. In practice, we're not triggering those
operations with any frequency.
--- linux/fs/nfs/inode.c Fri Nov 9 14:28:15 2001
+++ linux-nfsappendrace/fs/nfs/inode.c Wed Dec 5 17:12:28 2001
@@ -868,8 +868,9 @@
__nfs_revalidate_inode(struct nfs_server *server, struct inode *inode)
{
int status = -ESTALE;
struct nfs_fattr fattr;
+ int writebacks;
dfprintk(PAGECACHE, "NFS: revalidating (%x/%Ld)\n",
inode->i_dev, (long long)NFS_FILEID(inode));
@@ -889,8 +890,9 @@
}
}
NFS_FLAGS(inode) |= NFS_INO_REVALIDATING;
+ writebacks = nfs_have_writebacks(inode);
status = NFS_PROTO(inode)->getattr(inode, &fattr);
if (status) {
dfprintk(PAGECACHE, "nfs_revalidate_inode: (%x/%Ld) getattr failed, error=%d\n",
inode->i_dev, (long long)NFS_FILEID(inode), status);
@@ -900,8 +902,11 @@
remove_inode_hash(inode);
}
goto out;
}
+
+ if ( writebacks && nfs_size_to_loff_t(fattr.size) < inode->i_size )
+ fattr.size = (__u64) inode->i_size;
status = nfs_refresh_inode(inode, &fattr);
if (status) {
dfprintk(PAGECACHE, "nfs_revalidate_inode: (%x/%Ld) refresh failed, error=%d\n",
Please cc on responses, thanks!
Xeno
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/