The above has nothing to do with the O_DIRECT changes; it was present
in 2.4.9 too.
My worry is that failing with -EIO or whatever when we have already written
something can screw up the app: the app will think the pos is still at the
start of our writes. Ignoring the osync failure (which can be generated only
by an I/O error) was along the lines of ignoring a failure in
prepare_write/commit_write if we have already written something. So to me it
looked quite intentional, not just a thinko. In those cases, if we wrote
something we report "written" as a short write (in fact a short write
from the kernel just indicates that something is not straightforward);
only if nothing was written yet do we report -EIO. So the app will know
that something has been written and that the "pos" of the fd has been
updated; then it will try again to write the remaining part and it will get
the -EIO next time.
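
To make the convention above concrete, here is a rough userspace sketch. It
is not the kernel code: write_result(), fake_write(), write_all() and
struct fake_file are hypothetical names made up for illustration, and the
"device" is just a counter that fails at one offset.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model of a file: current position plus one bad offset. */
struct fake_file {
    size_t pos;
    size_t bad_offset;
};

/* Pick the return value: progress wins over the error. */
static ssize_t write_result(size_t written, int err)
{
    if (written > 0)
        return (ssize_t)written;  /* short write, error deferred */
    return err;                   /* nothing written: report it now */
}

/* Simulated write: advances pos byte by byte, stops at the bad offset. */
static ssize_t fake_write(struct fake_file *f, const char *buf, size_t count)
{
    size_t written = 0;
    int err = 0;

    (void)buf;  /* the data itself is irrelevant for this model */
    while (written < count) {
        if (f->pos == f->bad_offset) {
            err = -EIO;
            break;
        }
        f->pos++;
        written++;
    }
    return write_result(written, err);
}

/* What the app is expected to do: retry the remainder after a short write. */
static ssize_t write_all(struct fake_file *f, const char *buf, size_t count)
{
    size_t done = 0;

    while (done < count) {
        ssize_t n = fake_write(f, buf + done, count - done);
        if (n < 0)
            return n;  /* the deferred -EIO shows up on the retry */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

With a bad offset at 3, the first call returns a short write of 3 and leaves
pos at 3; the retry of the remaining bytes then gets -EIO with pos unchanged.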
But common sense tells me that a failing O_SYNC should somehow be
reported even if we have already written something, otherwise it could be
silently ignored. The app at the very least should try again or report a
failure, rather than losing the data journaling due to the I/O errors in the
data/metadata flushing. At least this osync failure is something that
shouldn't happen in production. If an osync fails it means there's a bad
sector or at the very least some other unrelated software bug.
I'm unsure (it's basically a matter of API, not something a kernel
developer can choose liberally), and SUSv2 doesn't say anything about
O_SYNC failures in the write(2) manpage, but I guess it would at
least be saner to move the "pos" back if we fail the osync after having
already written something (i.e. if we previously advanced pos).
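
As a rough sketch of that last idea (again a userspace model, not a patch:
struct osync_file, its sync_fails flag and fake_osync_write() are
hypothetical names used only for illustration):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model: a position and a switch that makes the sync fail. */
struct osync_file {
    off_t pos;
    int sync_fails;
};

/* Simulated O_SYNC write: the data goes in, then the flush may fail. */
static ssize_t fake_osync_write(struct osync_file *f, size_t count)
{
    off_t start = f->pos;
    int err;

    f->pos += (off_t)count;              /* data copied, pos advanced */

    err = f->sync_fails ? -EIO : 0;      /* stand-in for the osync step */
    if (err) {
        f->pos = start;                  /* undo the advance of pos */
        return err;                      /* report the failure, no short write */
    }
    return (ssize_t)count;
}
```

This way the error is reported and the app still sees a pos consistent with
"nothing was written", so a plain retry of the same buffer does the right
thing.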
Comments? Andrew?
Andrea