On Sun, 31 Jul 2011 15:49:48 -0400 Scott Schaefer <saschae...@neurodiverse.org> wrote:
> I am glad that you phrased your request "It would better if it managed
> to say it failed doing the requested operation.".
>
> Because it indeed did successfully perform the operation, exactly as
> the output indicated. That is, it DID indeed set the MD_DISK_FAULTY
> attribute on the /dev/sdb2 device of the /dev/md0 array.
>
> To be more precise, it set the attribute via ioctl() call to the
> kernel 'md' driver. (~ lines 980-995 of Manage.c).
>
> Unfortunately, (or rather, fortunately, for your data as well as
> your blood pressure), the kernel 'md' driver, when receiving this
> request, sets flag to initiate a recovery, or, if a recovery is
> already in progress (as in your case), sets flag for
> MD_RECOVERY_RECOVER.
>
> I have not attempted to understand all the possibilities in the
> kernel driver. However, it appears that, at least for RAID-1,
> the FAULTY flag on the (sdb2) device is cleared when the recovery
> completes, and the 'RECOVERY_RECOVER' finds nothing more to do.
>
> At this point, I believe this a "won't fix" issue; one could
> potentially ask for mdadm to do some before/after status-check
> magic and "handle" this and other potential cases in some
> "better" way. Asking it to do so raises a great deal many more
> problems than it solves.

I've just queued the following kernel patch which will be in 3.1 which I
believe is the best way to address this issue.

Thanks,
NeilBrown

From 70792a4e8fc486ab82449cb3165268131875b7c1 Mon Sep 17 00:00:00 2001
From: NeilBrown <ne...@suse.de>
Date: Mon, 1 Aug 2011 12:28:41 +1000
Subject: [PATCH] md: report failure if a 'set faulty' request doesn't.

Sometimes a device will refuse to be set faulty.  e.g. RAID1 will
never let the last working device become faulty.
So check if "md_error()" did manage to set the faulty flag and fail
with EBUSY if it didn't.

Resolves-Debian-Bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=601198
Reported-by: Mike Hommey <mh+report...@glandium.org>
Signed-off-by: NeilBrown <ne...@suse.de>

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8e221a2..1cd9bfb 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2561,7 +2561,10 @@ state_store(mdk_rdev_t *rdev, const char *buf, size_t len)
 	int err = -EINVAL;
 	if (cmd_match(buf, "faulty") && rdev->mddev->pers) {
 		md_error(rdev->mddev, rdev);
-		err = 0;
+		if (test_bit(Faulty, &rdev->flags))
+			err = 0;
+		else
+			err = -EBUSY;
 	} else if (cmd_match(buf, "remove")) {
 		if (rdev->raid_disk >= 0)
 			err = -EBUSY;
@@ -5983,6 +5986,8 @@ static int set_disk_faulty(mddev_t *mddev, dev_t dev)
 		return -ENODEV;
 
 	md_error(mddev, rdev);
+	if (!test_bit(Faulty, &rdev->flags))
+		return -EBUSY;
 	return 0;
 }
 
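
For anyone who wants to see the idea of the patch outside the kernel, here
is a minimal userspace sketch in C. It is only an illustration: the names
toy_dev, toy_array, toy_working, toy_error and toy_set_faulty are all
hypothetical and do not exist in md.c, and the "refuse to fail the last
working device" rule is a simplified stand-in for what RAID1 does. The
point it tries to show is the same as in the patch: the error routine
returns nothing and may decline the request, so the caller checks the
flag afterwards and reports EBUSY if it was not set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from md.c. */
struct toy_dev {
	bool faulty;
};

struct toy_array {
	struct toy_dev *devs;
	size_t ndevs;
};

/* Count the devices that are not marked faulty. */
size_t toy_working(const struct toy_array *a)
{
	size_t n = 0;

	for (size_t i = 0; i < a->ndevs; i++)
		if (!a->devs[i].faulty)
			n++;
	return n;
}

/*
 * Stand-in for md_error(): it returns nothing, and it silently
 * declines to fault the last working device (assumed RAID1-like rule).
 */
void toy_error(struct toy_array *a, size_t idx)
{
	assert(idx < a->ndevs);
	if (a->devs[idx].faulty)
		return;		/* already faulty, nothing to do */
	if (toy_working(a) <= 1)
		return;		/* refuse: this is the last working device */
	a->devs[idx].faulty = true;
}

/*
 * Stand-in for set_disk_faulty() after the patch: ask for the fault,
 * then verify the flag and fail with -EBUSY if it was not set.
 */
int toy_set_faulty(struct toy_array *a, size_t idx)
{
	if (idx >= a->ndevs)
		return -ENODEV;

	toy_error(a, idx);
	if (!a->devs[idx].faulty)
		return -EBUSY;
	return 0;
}
```

With a two-device toy array, faulting device 0 succeeds and returns 0,
while a following request to fault device 1 leaves it working and
returns -EBUSY, which is the case the original report wanted surfaced.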