On 04/23/12 01:47 PM, Manuel Ryan wrote:
Hello, I have looked around this mailing list and other virtual spaces and I wasn't able to find a situation similar to this weird one.

I have a 6-disk raidz zfs15 pool. After a scrub, the pool and all disks still show up as "ONLINE", but two of the disks are starting to give me errors and I do have fatal data corruption. The disks seem to be failing differently:

disk 2 has 78 (not growing) read errors, 43k (growing) write errors and 3 (not growing) checksum errors.

disk 5 has 0 read errors, 0 write errors but 7.4k checksum errors (growing).

Data corruption is around 22k files.

I plan to replace both disks. Which disk do you think should be replaced first to lose as little data as possible?

I was thinking of replacing disk 5 first, as it seems to have a lot of "silent" data corruption, so maybe it's a bad idea to use its output to rebuild disk 2. Also, checksum and read errors on disk 2 do not seem to be growing while I use the pool to back up data (corrupted files could not be accessed, but a lot of files were fine), but write errors are growing extremely fast. So reading uncorrupted data from disk 2 seems to work, but writing to it seems to be problematic.

Do you guys also think I should change disk 5 first, or am I missing something?

If it were my data, I'd set the pool read-only, back up, rebuild and restore. You do risk further data loss (maybe even pool loss) while the new drive is resilvering.
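
To illustrate why resilvering is risky here, below is a toy Python sketch of single-parity reconstruction. It is only a simplified model and rests on several assumptions: one fixed stripe of 5 data blocks plus 1 XOR parity block, SHA-256 standing in for the block checksum, and hypothetical helper names (xor_blocks, rebuild). The real raidz layout and checksums are more involved. The point it shows is that a block from the disk being replaced can only be rebuilt if every other disk in the stripe returns good data.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def checksum(block):
    """Stand-in block checksum (assumption: SHA-256, not what ZFS stores)."""
    return hashlib.sha256(block).hexdigest()


# One stripe on a toy 6-disk single-parity layout: 5 data blocks + 1 parity.
data = [bytes([i] * 8) for i in range(1, 6)]
parity = xor_blocks(data)
expected = [checksum(b) for b in data]  # checksums recorded at write time


def rebuild(missing, blocks, parity_block):
    """Rebuild the block at index `missing` from the other blocks plus parity."""
    others = [b for i, b in enumerate(blocks) if i != missing]
    return xor_blocks(others + [parity_block])


# Case 1: one disk is being replaced, all other disks return good data.
rebuilt_ok = rebuild(1, data, parity)
case1_valid = checksum(rebuilt_ok) == expected[1]

# Case 2: same disk being replaced, but another disk silently flips one bit.
damaged = list(data)
damaged[4] = bytes([data[4][0] ^ 0x01]) + data[4][1:]
rebuilt_bad = rebuild(1, damaged, parity)
case2_valid = checksum(rebuilt_bad) == expected[1]

print("one disk missing:", "recovered" if case1_valid else "lost")
print("one missing + one corrupt:", "recovered" if case2_valid else "lost")
```

In the second case the checksum mismatch shows that the rebuilt block is wrong, but with single parity there is nothing left to repair it from. That is roughly the situation for any stripe where both failing disks hold bad data.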

I would only use raidz for unimportant data, or for a copy of data from a more robust pool.

--
Ian.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
