On Fri, Aug 19, 2011 at 04:50:01PM -0400, Dan Langille wrote:
> System in question: FreeBSD 8.2-STABLE #3: Thu Mar 3 04:52:04 GMT 2011
>
> After a recent power failure, I'm seeing this in my logs:
>
> Aug 19 20:36:34 bast smartd[1575]: Device: /dev/ad2, 2 Currently unreadable (pending) sectors

I doubt this is related to a power failure.

> Searching on that error message, I was led to believe that identifying the
> bad sector and running dd to read it would cause the HDD to reallocate that
> bad block.
>
> http://smartmontools.sourceforge.net/badblockhowto.html

This is incorrect (meaning you've misunderstood what's written there).

Unreadable LBAs can be a result of the LBA being actually bad (as in
uncorrectable), or the LBA being marked "suspect". In either case the
LBA will return an I/O error when read.

If the LBAs are marked "suspect", the drive will perform re-analysis of
the LBA (to determine if the LBA can be read and the data re-mapped, or
if it cannot then the LBA is marked uncorrectable) when you **write** to
the LBA.

The above smartd output doesn't tell me much. Providing actual SMART
attribute data (smartctl -a) for the drive would help. The brand of the
drive, the firmware version, and the model all matter -- every drive
behaves a little differently.

Furthermore, if the LBA is re-analysed and determined to be
uncorrectable -- regardless of remapping -- this doesn't actually fix
I/O errors on a filesystem level. The filesystem itself (and more often
than not in the data section of the file/inode, so things like fsck
can't work around this) can still contain references to the LBA which is
uncorrectable, and will still continue to return I/O errors when read.
There has to be a way to tell the filesystem, when formatted, "avoid use
of this LBA". How UFS/FFS handles this is unknown to me. I know of
badsect(8) but I don't know if this works. "Transparent" remapping I
have never seen work except on SSDs.

If you want me to step you through the procedure of re-testing the LBAs
(assuming they're suspect and not uncorrectable) I can do so, just ask.
Finding the suspect LBAs can be done using a dd loop (I wrote a shell
script for this), or using "smartctl -t select,0-max /dev/XXX" and let
the drive's internal selective test see if it can find them.
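
To make the dd approach concrete, here is a rough sketch of what such a
loop might look like. This is not the script referred to above; the
function names scan_lbas and zero_lba are made up for illustration, and
the 512-byte sector size is an assumption you should check against your
drive. The second function shows the single-sector write, mainly to make
the skip= versus seek= distinction visible. It overwrites the sector with
zeros, so whatever data was there is gone.

```shell
#!/bin/sh
# Sketch only. scan_lbas and zero_lba are hypothetical helper names.
# SECTOR_SIZE assumes 512-byte logical sectors; verify this for your drive.
SECTOR_SIZE=512

# Read LBAs $2 through $3 of device (or image file) $1, one sector at a
# time, and print the number of every sector dd fails to read.
# skip= counts blocks on the INPUT side (if=).
scan_lbas() {
    dev=$1
    lba=$2
    last=$3
    while [ "$lba" -le "$last" ]; do
        if ! dd if="$dev" of=/dev/null bs="$SECTOR_SIZE" \
                skip="$lba" count=1 2>/dev/null; then
            echo "$lba"
        fi
        lba=$((lba + 1))
    done
}

# Overwrite exactly one sector, LBA $2 of $1, with zeros.
# WARNING: this destroys the contents of that sector.
# seek= counts blocks on the OUTPUT side (of=). Using skip= here instead
# would always write to LBA 0, which is the mix-up to avoid.
# conv=notrunc only matters when $1 is a regular file (e.g. a test image).
zero_lba() {
    dd if=/dev/zero of="$1" bs="$SECTOR_SIZE" \
        seek="$2" count=1 conv=notrunc 2>/dev/null
}
```

Usage would be along the lines of "scan_lbas /dev/ad2 START END" with a
range you actually want to test, followed by "zero_lba /dev/ad2 N" for an
LBA it reports. Reading one sector per dd invocation is slow, so this is
only practical over a narrow range, not a whole disk.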
From there it's an issue of submitting a write request to the LBA and
seeing what happens (I do this via dd as well, but the parameters you
pass it are very specific, e.g. don't mix up/misunderstand seek vs.
skip).

I've assisted with this time and time again for folks on forums with
varying success. I've also found some models of drives which claim there
are suspect LBAs yet an internal surface scan passes with no issues.
These are drives which I myself have; the only difference between my
drives and the other individual's drive is firmware, which leads me to
believe there is a firmware bug on some drives in the field.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, US |
| Making life hard for others since 1977.               PGP 4BD6C0CB |