Re: [BUG] Failed writes marked clean?

Theodore Ts'o (tytso@mit.edu)
Fri, 8 Nov 2002 18:35:30 -0500


On Fri, Nov 08, 2002 at 12:57:19PM -0800, Andrew Morton wrote:
> Well before going and changing stuff, we need to decide what to
> change it _to_. What do we want to happen if there's a read error?
> And a write error?
>
> For reads, it makes sense for the page/buffer to be left not uptodate,
> and return an error.

In some circumstances, it may actually make sense to try writing a
random block of data to the disk, since that may force the disk to
remap the block. (Disks generally only remap a block from the pool of
spare blocks on writes, not on reads.)

Unfortuantely, if the error was just a transient one, you might end up
smashing the block when you write random garbage in an attempt to
remap the block. So perhaps the answer is to retry the read, and if
that fails, *then* try to do a forced rewrite of the block.

The next question is whether to do this in userspace or in the kernel.
And if in the kernel, whether it should be done at the device driver
layer, or in the block I/O layer, or in the filesystem?

I can make a case for doing it in userspace, since that gives us the
most amount of flexibility, and it gives us ample opportunity to do
special things, such as paging an operator for help, etc. On the
other hand, there are arguments for doing it in the kernel. It may be
that an appropriately clever filesystem might be able to do more
intelligent recovery while keeping the filesystem mounted.

- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/