On 03/02/2010 23:15, Aleksandr Levchuk wrote:
We switched to OpenSolaris + ZFS. RAID6 + hot spare on LSI Engenio SAN
hardware worked well for us. (I'm used to the SAN management GUI. Also,
something that RAID-Z would not be able to do: the SAN lights up the amber
LEDs on the drives that fail, so I know which one to replace.)
So, I wanted to try to stick with hardware RAID for data protection. I
understand that the end-to-end checks of ZFS make it better at detecting
corruption.
In my case, I can imagine that ZFS would FREEZE the whole volume when a single
block or file is found to be corrupted.
Ideally, I would not like this to happen and instead would like to get a log
with names of corrupted files.
What exactly happens when ZFS detects a corrupted block/file and does not
have redundancy to correct it?
Alex
Your wish is...
That's exactly what should happen: zpool status -v should give you a
list of the affected files, which you should then be able to delete. If
the corrupted block contains metadata, ZFS should actually be able to
fix it on the fly, because all metadata blocks are kept in at least two
copies even if no redundancy is configured at the pool level.
Let's test it:
mi...@r600:~# mkfile 128m file1
mi...@r600:~# zpool create test `pwd`/file1
mi...@r600:~# zpool status test
pool: test
state: ONLINE
scrub: none requested
config:
NAME                        STATE     READ WRITE CKSUM
test                        ONLINE       0     0     0
  /export/home/milek/file1  ONLINE       0     0     0
errors: No known data errors
mi...@r600:~#
mi...@r600:~# cp /bin/bash /test/file1
mi...@r600:~# cp /bin/bash /test/file2
mi...@r600:~# cp /bin/bash /test/file3
mi...@r600:~# cp /bin/bash /test/file4
mi...@r600:~# cp /bin/bash /test/file5
mi...@r600:~# cp /bin/bash /test/file6
mi...@r600:~# cp /bin/bash /test/file7
mi...@r600:~# cp /bin/bash /test/file8
mi...@r600:~# cp /bin/bash /test/file9
mi...@r600:~# sync
mi...@r600:~# dd if=/dev/zero of=file1 seek=50 count=10000 conv=notrunc
10000+0 records in
10000+0 records out
5120000 bytes (5.1 MB) copied, 0.179617 s, 28.5 MB/s
mi...@r600:~# sync
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
pool: test
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: scrub completed after 0h0m with 7 errors on Thu Feb 4 00:18:40 2010
config:
NAME                        STATE     READ WRITE CKSUM
test                        DEGRADED     0     0     7
  /export/home/milek/file1  DEGRADED     0     0    29  too many errors
errors: Permanent errors have been detected in the following files:
/test/file1
mi...@r600:~#
mi...@r600:~# rm /test/file1
mi...@r600:~# sync
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
pool: test
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub completed after 0h0m with 0 errors on Thu Feb 4 00:19:55 2010
config:
NAME                        STATE     READ WRITE CKSUM
test                        DEGRADED     0     0     7
  /export/home/milek/file1  DEGRADED     0     0    29  too many errors
errors: No known data errors
mi...@r600:~# zpool clear test
mi...@r600:~# zpool scrub test
mi...@r600:~# zpool status -v test
pool: test
state: ONLINE
scrub: scrub completed after 0h0m with 0 errors on Thu Feb 4 00:20:12 2010
config:
NAME                        STATE     READ WRITE CKSUM
test                        ONLINE       0     0     0
  /export/home/milek/file1  ONLINE       0     0     0
errors: No known data errors
mi...@r600:~#
mi...@r600:~# ls -la /test/
total 7191
drwxr-xr-x 2 root root 10 2010-02-04 00:19 .
drwxr-xr-x 28 root root 30 2010-02-04 00:17 ..
-r-xr-xr-x 1 root root 799040 2010-02-04 00:17 file2
-r-xr-xr-x 1 root root 799040 2010-02-04 00:17 file3
-r-xr-xr-x 1 root root 799040 2010-02-04 00:17 file4
-r-xr-xr-x 1 root root 799040 2010-02-04 00:17 file5
-r-xr-xr-x 1 root root 799040 2010-02-04 00:17 file6
-r-xr-xr-x 1 root root 799040 2010-02-04 00:17 file7
-r-xr-xr-x 1 root root 799040 2010-02-04 00:18 file8
-r-xr-xr-x 1 root root 799040 2010-02-04 00:18 file9
mi...@r600:~#
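
As for getting a log with the names of the corrupted files: this was not part of the test run above, but a minimal sketch of one way to do it is to filter the "errors:" section of the zpool status -v output. The helper name list_corrupt_files and the log path in the usage comment are made up for illustration, and the filter assumes the section keeps the layout shown in the transcript (an "errors: Permanent errors ..." header followed by one indented entry per line).

```shell
# list_corrupt_files (hypothetical helper): read `zpool status -v` output
# on stdin and print one affected entry per line, leading whitespace removed.
# Prints nothing when the pool reports "No known data errors".
list_corrupt_files() {
    awk '
        # A new "pool:" header ends the error list of the previous pool.
        /^[ \t]*pool:/              { in_list = 0 }
        # The header line that introduces the list of damaged files.
        /^errors: Permanent errors/ { in_list = 1; next }
        # Inside the list: skip blank lines, strip indentation, print the rest.
        in_list && NF > 0           { sub(/^[ \t]+/, ""); print }
    '
}

# Possible usage (the log path is only an example):
#   zpool status -v test | list_corrupt_files >> /var/tmp/zfs-corrupt-files.log
```

Entries are printed exactly as zpool reports them, so anything that is not a plain path ends up in the log unchanged rather than being interpreted.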
--
Robert Milkowski
http://milek.blogspot.com
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss