On Wednesday, February 02, 2011 02:06:15 am Chuck Munro wrote:
> The real key is to carefully label each SATA cable and its associated 
> drive.  Then the little mapping script can be used to identify the 
> faulty drive which mdadm reports by its device name.  It just occurred 
> to me that whenever mdadm sends an email report, it can also run a 
> script which groks out the path info and puts it in the email message. 
> Problem solved :-)

Ok, perhaps I'm dense, but if this is not a hot-swap bay you're talking about, 
wouldn't it be easier to have the drive's serial number (or other identifier 
found on the label) pulled into the e-mail, and compare it with the label 
physically found on the drive, since you're going to have to open the case 
anyway?  Using something like: 

hdparm -I $DEVICE | grep Serial.Number

works here (in the regexp Serial.Number the dot matches any single character, 
so it matches the string "Serial Number" without needing quotes around the 
pattern).  Use whatever $DEVICE you need, as long as it's on a controller 
that supports hdparm queries. 
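
As a rough sketch of how that one-liner could be wrapped for use in a report 
script: the function names below (serial_from_hdparm, drive_serial, 
describe_drive) are made up for illustration, and the parsing assumes the 
usual "Serial Number:" line in hdparm -I output.  Untested on anything but 
that assumption, so treat it as a starting point only.

```shell
#!/bin/sh
# Sketch only: these function names are hypothetical, not part of hdparm or mdadm.

# Read `hdparm -I` output on stdin and print just the serial number,
# with leading and trailing whitespace stripped.
serial_from_hdparm() {
    sed -n 's/^[[:space:]]*Serial Number:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p'
}

# Print the serial number of one device, e.g. /dev/sdb (normally needs root).
drive_serial() {
    hdparm -I "$1" 2>/dev/null | serial_from_hdparm
}

# Print a one-line summary suitable for a mail body.
# Falls back to a note when the drive no longer answers the query.
describe_drive() {
    serial=$(drive_serial "$1")
    echo "$1 serial: ${serial:-unavailable (drive did not respond)}"
}
```

Calling describe_drive /dev/sdb from whatever script builds the e-mail would 
then add one line per drive that can be checked against the printed label.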

I have seen cases with a different Linux distribution where the actual module 
load order was nondeterministic (modules loaded in parallel); while upstream 
and the CentOS rebuild try to make things more deterministic, wouldn't it be 
safer to get a really unique, externally visible identifier from the drive?  If 
the drive has failed to the degree that it won't respond to the query, then 
query all the good drives in the array for their serial numbers, and use a 
process of elimination.  This, IMO, is more robust than relying on the drive 
detection order to remain deterministic.
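
The process of elimination could be scripted along these lines.  This is 
only a sketch: it assumes you saved a list of the array's serial numbers, 
one per line, while all drives were healthy, and the function names and file 
paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names and file paths are hypothetical.

# Print the serial number of every device given as an argument, one per line.
# A drive that does not answer the query simply contributes no line.
collect_serials() {
    for dev in "$@"; do
        hdparm -I "$dev" 2>/dev/null |
            sed -n 's/^[[:space:]]*Serial Number:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p'
    done
}

# Print the serials listed in file $1 (expected) that do not appear
# in file $2 (drives that responded), using whole-line fixed-string matching.
missing_serials() {
    grep -vxF -f "$2" "$1"
}

# Example use (paths are placeholders):
#   collect_serials /dev/sd[a-d] > /tmp/present-serials
#   missing_serials /root/array-serials.txt /tmp/present-serials
```

Whatever serial number comes out of missing_serials is the one to look for 
on the drive labels once the case is open.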

If the drives are in a hotswap or coldswap bay, generate some I/O on the 
array and see which LEDs don't blink; that should correspond to the failed 
drive.  If the bay has secondary LEDs, you might be able to blink those, too.
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
