Re: [zfs-discuss] ZFS offline ZIL corruption not detected

George Wilson Thu, 26 Aug 2010 07:55:44 -0700

Edward Ned Harvey wrote:

From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
boun...@opensolaris.org] On Behalf Of Neil Perrin


This is a consequence of the design for performance of the ZIL code.
Intent log blocks are dynamically allocated and chained together.
When reading the intent log we read each block and checksum it
with the embedded checksum within the same block. If we can't read
a block due to an IO error then that is reported, but if the checksum
does not match then we assume it's the end of the intent log chain.
Using this design means the minimum number of writes to add
an intent log record is just one write.

So corruption of an intent log is not going to generate any errors.
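
To make the behaviour described above concrete, here is a minimal Python sketch of a chain walk that treats a checksum mismatch as end-of-log. It is an illustration only: the block layout (CRC32, index of next block, payload) and the names make_block and replay_chain are hypothetical and are not the on-disk ZIL format or the actual ZFS code.

```python
import struct
import zlib

# Hypothetical layout: 4-byte CRC32 | 4-byte index of next block | payload.
# The checksum is embedded in the block it protects.
HEADER = struct.Struct("<II")


def make_block(next_index, payload):
    """Build one log block in a single buffer (one write per record)."""
    body = struct.pack("<I", next_index) + payload
    return struct.pack("<I", zlib.crc32(body)) + body


def replay_chain(device, start):
    """Walk the chain from `start` and return the payloads of valid blocks."""
    records = []
    index = start
    while True:
        block = device[index]  # a KeyError here stands in for a reported IO error
        stored_crc, next_index = HEADER.unpack_from(block)
        if zlib.crc32(block[4:]) != stored_crc:
            break  # mismatch is taken as end of chain; no error is raised
        records.append(block[HEADER.size:])
        index = next_index
    return records


# Three written records; block 3 is allocated but never written (stale zeros).
device = {
    0: make_block(1, b"a"),
    1: make_block(2, b"b"),
    2: make_block(3, b"c"),
    3: bytes(8),
}
intact = replay_chain(device, 0)

# Silently damage block 1: the walk stops early and reports nothing.
damaged = bytearray(device[1])
damaged[-1] ^= 0xFF
corrupted_device = dict(device)
corrupted_device[1] = bytes(damaged)
truncated = replay_chain(corrupted_device, 0)
```

In this sketch the corrupted device is indistinguishable from a log that simply ended after the first record, which is the point made above.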


I didn't know that.  Very interesting.  This raises another question ...

It's commonly stated, that even with log device removal supported, the most
common failure mode for an SSD is to blindly write without reporting any
errors, and only detect that the device is failed upon read.  So ... If an
SSD is in this failure mode, you won't detect it?  At bootup, the checksum
will simply mismatch, and we'll chug along forward, having lost the data ...
(nothing can prevent that) ... but we don't know that we've lost data?

If the drive's firmware isn't returning back a write error of any kind
then there isn't much that ZFS can really do here (regardless of whether
this is an SSD or not). Turning every write into a read/write operation
would totally defeat the purpose of the ZIL. It's my understanding that
SSDs will eventually transition to read-only devices once they've
exceeded their spare reallocation blocks. This should propagate to the
OS as an EIO which means that ZFS will instead store the ZIL data on the
main storage pool.


Worse yet ... In preparation for the above SSD failure mode, it's commonly
recommended to still mirror your log device, even if you have log device
removal.  If you have a mirror, and the data on each half of the mirror
doesn't match each other (one device failed, and the other device is good)
... Do you read the data from *both* sides of the mirror, in order to
discover the corrupted log device, and correctly move forward without data
loss?

Yes, we read all sides of the mirror when we claim (i.e. read) the log
blocks for a log device. This is exactly what a scrub would do for a
mirrored data device.
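
As a rough illustration of reading every side of a mirror, the Python sketch below checks one log block on each side and reports which sides failed the embedded checksum. The names seal, is_valid and claim_block and the 4-byte CRC32 prefix are assumptions made for this example, not ZFS interfaces or formats.

```python
import zlib


def seal(payload):
    """Prefix the payload with its CRC32 (hypothetical block format)."""
    return zlib.crc32(payload).to_bytes(4, "little") + payload


def is_valid(block):
    """True if the embedded checksum matches the rest of the block."""
    return int.from_bytes(block[:4], "little") == zlib.crc32(block[4:])


def claim_block(sides, index):
    """Read block `index` from every mirror side.

    Returns (payload from the first valid copy or None, list of bad sides).
    """
    good = None
    bad_sides = []
    for n, side in enumerate(sides):
        block = side[index]
        if is_valid(block):
            if good is None:
                good = block[4:]
        else:
            bad_sides.append(n)
    return good, bad_sides


healthy = [seal(b"rec0"), seal(b"rec1")]

# Side 0 silently corrupted its copy of block 1.
bad_copy = bytearray(seal(b"rec1"))
bad_copy[-1] ^= 0xFF
failing = [seal(b"rec0"), bytes(bad_copy)]

payload, bad = claim_block([failing, healthy], 1)
```

Because every side is checked rather than only the first readable one, the good copy is still used and the failing side is identified in the same pass.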


- George


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

