On 04/23/12 01:47 PM, Manuel Ryan wrote:
Hello, I have looked around this mailing list and other virtual spaces
and I wasn't able to find a situation similar to this weird one.
I have a 6-disk raidz zfs15 pool. After a scrub, the status of the
pool and all disks still shows up as "ONLINE" but two of the disks are
starting to give me errors and I do have fatal data corruption. The
disks seem to be failing differently:
disk 2 has 78 (not growing) read errors, 43k (growing) write errors
and 3 (not growing) checksum errors.
disk 5 has 0 read errors, 0 write errors but 7.4k checksum errors
(growing).
Data corruption is around 22k files.
I plan to replace both disks. Which disk do you think should be
replaced first to lose as little data as possible?
I was thinking of replacing disk 5 first as it seems to have a lot of
"silent" data corruption, so maybe it's a bad idea to use its output
to replace disk 2. Also, checksum and read errors on disk 2 did not
seem to grow while I used the pool to back up data (corrupted files
could not be accessed, but a lot of files were fine), but write errors
are growing extremely fast. So reading uncorrupted data from disk 2
seems to be working but writing to it seems to be problematic.
Do you guys also think I should change disk 5 first, or am I missing
something?
If it were my data, I'd set the pool read-only, back up, rebuild and
restore. You do risk further data loss (maybe even pool loss) while the
new drive is resilvering.
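
As a rough illustration of the first two steps (read-only, then back
up), here is a minimal dry-run sketch. It only builds and prints the
commands and runs nothing. The pool name "tank", the backup path
"/backup/tank", the helper name rescue_plan and the assumption that the
pool is mounted at its default location are all hypothetical and not
taken from this thread. It uses the dataset-level readonly property;
whether a pool-level read-only import is available depends on the ZFS
version in use. The rebuild and restore steps are left out on purpose
because they are destructive and depend on the disk layout.

```python
# Dry-run sketch: builds the commands as argument lists and prints them.
# Nothing is executed. All names below are hypothetical examples.

def rescue_plan(pool, backup_dir):
    """Return the commands, in order, for the read-only and backup steps."""
    return [
        # Mark the pool's filesystems read-only (dataset property,
        # inherited by child datasets unless they override it).
        ["zfs", "set", "readonly=on", pool],
        # List the files ZFS has already flagged as damaged.
        ["zpool", "status", "-v", pool],
        # Copy the data to other storage; assumes the pool is mounted
        # at its default mountpoint, /<pool>.
        ["rsync", "-a", f"/{pool}/", f"{backup_dir}/"],
    ]


if __name__ == "__main__":
    for cmd in rescue_plan("tank", "/backup/tank"):
        print(" ".join(cmd))
```

Review the printed commands and adapt the names before running anything
by hand.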
I would only use raidz for unimportant data, or for a copy of data from
a more robust pool.
--
Ian.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss