Thanks for taking the time to respond, Tom,

For clarification, it sounds like you are using hardware-based RAID-6, and not
ZFS RAID? Is this correct? Or was the faulty card simply an HBA?


You are correct. This particular file system is still using hardware RAID6.


At the bottom of the ‘zpool status -v pool_name’ output, you may see paths
and/or ZFS object IDs of the damaged/impacted files. These would be good to
take note of.


Yes, I output this to files at a few different times, and we've seen no change since replacing the RAID controller, which makes me feel reasonably comfortable leaving the file system in production.

There are 370 objects listed by ‘zpool status -v’, but I am unable to access at least 400 files. Almost all of our files are single-stripe.
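For reference, capturing those lists for later comparison can be as simple as
the following sketch (the pool name and output paths are illustrative):

zpool status -v pool_name > /root/zpool-status-$(date +%Y%m%d-%H%M).txt
# diff two successive captures to confirm the damaged-object list is stable
diff /root/zpool-status-OLD.txt /root/zpool-status-NEW.txt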


Running a ‘zpool scrub’ is a good idea. If the zpool is protected with "ZFS raid", the 
scrub may be able to repair some of the damage. If the zpool is not protected with "ZFS 
raid", the scrub will identify any other errors, but likely NOT repair any of the damage.


We're not protected with ZFS RAID, just hardware RAID6. I could run a patrol read on the hardware controller and then a ZFS scrub, if that makes the most sense at this point. This file system is scheduled to run a scrub the third week of every month, so it would run one this weekend otherwise.
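A minimal sketch of that sequence, assuming an LSI/MegaRAID controller managed
with MegaCli (the controller tooling and the pool name are assumptions on my
part; substitute your controller's management tool as needed):

# start a patrol read on all adapters (MegaRAID-specific)
MegaCli -AdpPR -Start -aALL
# poll patrol read progress until it completes
MegaCli -AdpPR -Info -aALL
# once the patrol read completes cleanly, scrub the pool and check the results
zpool scrub pool_name
zpool status -v pool_name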



If you have enough disk space on hardware that is behaving properly (and free
space in the source zpool), you may want to replicate the VDEVs (OSTs) that are
reporting errors. Having a replicated VDEV can afford you the ability to
examine the data without fear of further damage. You may also want to extract,
from the replicated VDEV(s), certain files which produce IO errors on the
source VDEV.

Something like this for replication should work:

zfs snap source_pool/source_ost@timestamp_label
zfs send -Rv source_pool/source_ost@timestamp_label | zfs receive \
    destination_pool/source_ost_replicated

You will need to set zfs_send_corrupt_data to 1 in /sys/module/zfs/parameters,
or the ‘zfs send’ will fail when sending a VDEV with read and/or checksum
errors.
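A minimal sketch of toggling that parameter (note it does not persist across a
module reload; restoring it to 0 afterwards is my own suggestion, not a
requirement):

# allow 'zfs send' to proceed past unreadable blocks
echo 1 > /sys/module/zfs/parameters/zfs_send_corrupt_data
# verify the setting took effect
cat /sys/module/zfs/parameters/zfs_send_corrupt_data
# after the send completes, restore the default
echo 0 > /sys/module/zfs/parameters/zfs_send_corrupt_data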
Enabling zfs_send_corrupt_data allows the ‘zfs send’ operation to complete. Any
blocks that are damaged on the source side will be filled with the pattern
0x2f5baddb10c on the destination side. This can be helpful in troubleshooting
whether an entire file is corrupt, or only parts of it.
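A rough way to spot that fill pattern in a replicated file (assuming a
little-endian host, where od's x8 output renders the value as
000002f5baddb10c; the file path is illustrative):

# print decimal offsets of 8-byte words matching the corrupt-block fill value
od -A d -t x8 /replica/mountpoint/path/to/file | grep '000002f5baddb10c'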

After the replication, you should set the replicated VDEV to read-only with
‘zfs set readonly=on destination_pool/source_ost_replicated’


Thank you for this suggestion. We'll most likely do that.

Best,
Jesse Stroik
