Can I put my hand up for that?
First patch (against 2.4.5-pre1):
The "ac" patches have code to swap the rdev->desc_nr of the failed
and spare drives when swapping in a spare after it has been
reconstructed.
This is needed to keep the "rdev" structures pointing to the right
raid-superblock slot.
However it isn't quite right.
It "ac" patches avoid the swap if either spare_rdev or failed_rdev
cannot be found, which is wrong (I put it in the avoid an oops. I
didn't know the full story then).
The situation that would cause one of these to not be found is if the
"failed" drive is actually a "missing" drive. i.e. the drive failed
and then was raid_hot_removed, or it just wan't present and working
at boot time.
When this happens, we still need to set spare_rdev->desc_nr to the
correct slot number.
The following patch gets this right for raid5.c and raid1.c (The only
places where it is relevant).
NeilBrown
(more patches to come. They will go to Linus, Alan, and linux-raid only).
--- ./drivers/md/raid5.c 2001/05/15 04:02:07 1.1
+++ ./drivers/md/raid5.c 2001/05/15 04:06:19 1.2
@@ -1704,6 +1704,7 @@
struct disk_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
mdp_super_t *sb = mddev->sb;
mdp_disk_t *failed_desc, *spare_desc, *added_desc;
+ mdk_rdev_t *spare_rdev, *failed_rdev;
print_raid5_conf(conf);
md_spin_lock_irq(&conf->device_lock);
@@ -1875,6 +1876,16 @@
/*
* do the switch finally
*/
+ spare_rdev = find_rdev_nr(mddev, spare_desc->number);
+ failed_rdev = find_rdev_nr(mddev, failed_desc->number);
+
+ /* There must be a spare_rdev, but there may not be a
+ * failed_rdev. That slot might be empty...
+ */
+ spare_rdev->desc_nr = failed_desc->number;
+ if (failed_rdev)
+ failed_rdev->desc_nr = spare_desc->number;
+
xchg_values(*spare_desc, *failed_desc);
xchg_values(*fdisk, *sdisk);
--- ./drivers/md/raid1.c 2001/05/15 04:02:07 1.1
+++ ./drivers/md/raid1.c 2001/05/15 04:06:19 1.2
@@ -832,6 +832,7 @@
struct mirror_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
mdp_super_t *sb = mddev->sb;
mdp_disk_t *failed_desc, *spare_desc, *added_desc;
+ mdk_rdev_t *spare_rdev, *failed_rdev;
print_raid1_conf(conf);
md_spin_lock_irq(&conf->device_lock);
@@ -989,6 +990,16 @@
/*
* do the switch finally
*/
+ spare_rdev = find_rdev_nr(mddev, spare_desc->number);
+ failed_rdev = find_rdev_nr(mddev, failed_desc->number);
+
+ /* There must be a spare_rdev, but there may not be a
+ * failed_rdev. That slot might be empty...
+ */
+ spare_rdev->desc_nr = failed_desc->number;
+ if (failed_rdev)
+ failed_rdev->desc_nr = spare_desc->number;
+
xchg_values(*spare_desc, *failed_desc);
xchg_values(*fdisk, *sdisk);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/