So tell us, why do you run your raid5 devices in degraded mode?? It
cannot be good for performance, and it certainly isn't good for
redundancy!!! But I'm not complaining, as you found a bug...
>
> Hope this gives someone an idea?
Yep. This, combined with a related bug report from n0ymv@callsign.net,
strongly suggests the following patch.
Writes to the failed drive never complete, so you eventually run out
of stripes in the stripe cache and block waiting for a stripe to
become free.
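The failure mode can be sketched with a tiny standalone model (names
like `run_writes`, `NR_STRIPES`, and `clear_on_skip` are invented for
illustration; this is not kernel code): a stripe is returned to the
free pool only once every buffer submitted for it has been unlocked,
so a skipped write that leaves its buffer locked pins its stripe
forever.

```c
#include <stdio.h>

#define NR_STRIPES 4   /* size of the modelled stripe cache */
#define DISKS      3   /* raid5 member count in this toy model */

/* Issue `requests` writes, each touching the failed disk (disk 1).
 * Returns the number of free stripes left afterwards.  With
 * clear_on_skip == 0, the skipped op leaves its buffer locked and
 * the stripe is never freed -- mirroring the bug. */
int run_writes(int requests, int clear_on_skip)
{
    int free_stripes = NR_STRIPES;

    for (int r = 0; r < requests; r++) {
        if (free_stripes == 0)
            break;          /* would block waiting for a free stripe */
        free_stripes--;     /* stripe taken for this write */

        int still_locked = 0;
        for (int disk = 0; disk < DISKS; disk++) {
            int failed = (disk == 1);
            if (!failed) {
                /* I/O submitted; its completion unlocks the buffer */
            } else if (!clear_on_skip) {
                still_locked = 1;   /* skipped op left the lock set */
            }
        }
        if (!still_locked)
            free_stripes++; /* all buffers unlocked: stripe freed */
    }
    return free_stripes;
}
```

Without the fix, `run_writes(100, 0)` drains the cache to 0 and stays
there; with the fix, `run_writes(100, 1)` returns every stripe to the
pool, which is what clearing BH_Lock on the skipped buffer achieves.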
Please test this and confirm that it works.
NeilBrown
--- ./drivers/md/raid5.c 2001/01/03 09:04:05 1.1
+++ ./drivers/md/raid5.c 2001/01/03 09:04:13
@@ -1096,8 +1096,10 @@
bh->b_rdev = bh->b_dev;
bh->b_rsector = bh->b_blocknr * (bh->b_size>>9);
generic_make_request(action[i]-1, bh);
- } else
+ } else {
PRINTK("skip op %d on disc %d for sector %ld\n", action[i]-1, i, sh->sector);
+ clear_bit(BH_Lock, &bh->b_state);
+ }
}
}
-