On a recent journey of pain and frustration, I had to recover a UFS filesystem from a broken disk. The disk had many bad blocks and more were going bad over time. Sadly, there were just a few files that I wanted, but I could not mount the disk without it killing my system. (PATA disks... PITA if you ask me...)

My recovery method, though painful, might be of value in helping you locate the bad regions of the disk.

What I did was to kick off a script that used dd, and did something like this...

==========
#!/usr/bin/ksh

# Copy the slice one 8kb block at a time, so a read error only costs
# that one block. conv=noerror stops dd from bailing out on a read
# error, and conv=sync pads the failed read with zeros so the block
# still gets written and the output file stays aligned with the disk.

SEEK=0

while :
do
        dd if=/dev/rdsk/c0d1s7 of=backup.ufs.s7 bs=8192 \
        oseek=${SEEK} iseek=${SEEK} count=1 conv=noerror,sync
        SEEK=$((SEEK + 1))
done
==========

(Or something to that effect.)
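One thing the loop above doesn't do is stop at the end of the slice - once SEEK runs past the end, dd just reads zero records and the loop spins forever, so it has to be killed by hand. If you'd rather have it stop on its own, you could work out the block count up front and loop to that. A minimal sketch, assuming an 8kb block size - the sector count here is a made-up example, so pull the real figure for your slice from prtvtoc or format's partition table:

==========
# Number of 8kb blocks in the slice, from its size in 512-byte sectors.
SECTORS=312581808               # made-up example - use your slice's real sector count
BLOCKS=$((SECTORS / 16))        # 16 x 512 bytes = one 8kb block

SEEK=0
while [ ${SEEK} -lt ${BLOCKS} ]
do
        dd if=/dev/rdsk/c0d1s7 of=backup.ufs.s7 bs=8192 \
        oseek=${SEEK} iseek=${SEEK} count=1 conv=noerror,sync
        SEEK=$((SEEK + 1))
done
==========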

Anyhoo - the point is that this hit the disk one block at a time (I chose 8kb because it was the ufs block size, and 512-byte blocks looked like they would take 3 weeks), and I was ultimately able to get my data back (at least the bits I cared about...) after futzing with fsck and some other novelties.

If you were to do something similar, but instead of copying each block you sent it to /dev/null and logged the result of dd, you would end up with a complete list of the broken blocks.
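Here's a rough sketch of that scan-only variant - untested, and the device name, block size and log file name are just placeholders. Note that conv=noerror is dropped on purpose: without it, dd exits non-zero when it hits an unreadable block, and that exit status is what gets logged.

==========
#!/usr/bin/ksh

DEV=/dev/rdsk/c0d1s7    # placeholder - your broken slice
BS=8192                 # placeholder - block size to test
SEEK=0

while :
do
        # No conv=noerror here, so dd fails (non-zero exit) on a bad block.
        dd if=${DEV} of=/dev/null bs=${BS} iseek=${SEEK} count=1 2>/dev/null
        if [ $? -ne 0 ]; then
                echo "bad ${BS}-byte block at block offset ${SEEK}" >> badblocks.log
        fi
        SEEK=$((SEEK + 1))
done
==========

Same caveat as the backup loop: it doesn't know where the slice ends, so either kill it by hand or bound it with a block count as above.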

A few footnotes:
- Yes, this is slow. WAY slow, and there are thousands of ways I could have done this better and faster. However, it saved me from having to do anything else, and at the time I did not feel like breaking out a compiler. Due to the massively large number of bad blocks on my disk, the size of the disk (160GB), and the number of retries my system made for each bad block, it took 10 days (!!) to read through the whole disk 8kb at a time.
- If you are happy to throw away larger chunks of the disk, you could use a larger block size, which would speed things up (rough numbers below).
- If your disk really does have bad blocks that are getting in the way, chances are it's going to get worse, and pain will ensue. I'd suggest that a new disk might be a better option.
- On the new disk front, note that many hard disks come with 5 year warranties these days. If the disk is not super old, you might be able to get it replaced under warranty if you send it directly to the manufacturer...
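Rough numbers on that block-size trade-off, assuming a 160GB disk like mine (back-of-the-envelope only): at 8kb per read that's on the order of 20 million dd invocations, while 1MB per read drops it to roughly 150 thousand - but then every bad sector costs you a whole 1MB chunk instead of 8kb.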

Hope this helps at least provide some ideas. :)

Oh - and.... get a new disk. ;)

Nathan.




Patrick P Korsnick wrote:

i have a machine with a disk that has some sort of defect and i've found that 
if i partition only half of the disk the machine will still work.  i tried 
to use 'format' to scan the disk and find the bad blocks, but it didn't work.

so as i don't know where the bad blocks are but i'd still like to use some of 
the rest of the disk, i thought ZFS might be able to help.  i partitioned the 
disk so slices 4,5,6 and 7 are each 5GB.  i thought i'd make one or multiple 
zpools on those slices and then i'd be able to narrow down where the bad 
sections are.

so my question is can i declare a zpool that spans multiple c0d0sXX but isn't a 
mirror and if i can, then will zfs be able to detect where the problem c0d0sXX 
is and not use it?  if not, i'll have to make 4 different zpools and experiment 
with storing stuff on each to find the approximate location of the bad blocks.
