Verifying a RAID device (Was: Are linux-fs's drive-fault-tolerant by concept?)

Peter Benie (peterb@chiark.greenend.org.uk)
Mon, 21 Apr 2003 15:11:14 +0100


John Bradford writes:
> [Chuck Ebbert writes]
> > I have some ugly code that forces all reads from a mirror set to
> > a specific copy, set via a global sysctl. This lets you do things
> > like make a backup from disk 0, then verify against disk 1 and take
> > action if something is wrong.
>
> That's interesting. Have you thought of making it read from _both_
> disks and check that the data matches, before passing it back?
>
> RAID1 mirrors guard against drive failiure, but if a drive returns bad
> data, but doesn't report an error, that will usually go unnoticed.

Checking the disks periodically guards against another important
failure mode. Consider this scenario:

Start with a RAID1 mirror using two apparently working disks.
The first disk develops a media fault, however, this goes unnoticed
because there happens to be no data stored on that part of the media.

Later, a fault is detected during a read of the second disk; the disk
is marked off off-line, but all the data is still readable on the
first disk so there's no need to panic yet.

You replace the known faulty disk with a new one. The md driver
automatically reconstructs the array from the first disk. Since the
md driver doesn't know about the filesystem, it reads every disk
block, regardless of whether it contains data or not.

The latent error on the first disk is now discovered, and the first
disk is now marked off-line. The replacement is only partially
reconstructed, so it remains inactive. Oops, there are no active disks
left - the md device has failed!

To guard against this, it is a good idea to periodically read all of
every disk in a RAID to detect faults early. Ideally, this should be
done as another background task, like reconstruction. Checking that
the data is valid is then a trivial extra step. If a read fails, you
mark the disk relevant disk off-line, as usual. If the parity check
fails, you just return a bad blocks list.

Peter
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/