actually the lowlevel currently is supposed to take care of that if it
writes directly with ll_rw_block like fsync_buffers_list takes care of
it before calling ll_rw_block. But the whole point is that normally the
write_inode path only marks the buffer dirty, it never writes directly,
and no dirty bitflag can be lost and we flush those dirty buffers right
with the proper wait_on_buffer before ll_rw_block. So I don't think it's
happening really in 2.4, at least w/o mounting the fs with -osync.
But I thought about about Mikulas suggestion of adding lock_buffer
in place of the test and set bit. This looks very appealing. the main
reason we didn't do that while fixing fsync_buffers_list a few months
ago in 2.4.20 or so, is been that it is a very risky change, I mean, I'm
feeling like it'll break something subtle.
In theory the only thing those bitflags controls are the coherency of
the buffer cache here, the rest is serialized by design at the
highlevel. And the buffer_cache should be ok with the lock_buffer(),
since we'll see the buffer update and we'll skip the write if we race
with other reads (we'll sleep in lock_buffer rather than in
wait_on_buffer). For the writes we'll overwrite the data one more time
(the feature incidentally ;). so it sounds safe. Performance wise it
shouldn't matter, for read it can't matter infact.
So I guess it's ok to make that change, then we could even move the
wait_on_buffer from fsync_buffers_list to the xfs buffer_delay path of
write_buffer. I'm tempted to make this change for next -aa. I feel it
makes more sense and it's a better fix even if if risky. Can anybody see
a problem with that?
Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/