On Sat, Jun 14, 2008 at 12:11 PM, Brian Wilson <[EMAIL PROTECTED]> wrote:
>
>> On Sat, 14 Jun 2008, zfsmonk wrote:
>>
>> > Mentioned on
>> > http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
>> > is the following: "ZFS works well with storage based protected LUNs
>> > (RAID-5 or mirrored LUNs from intelligent storage arrays). However,
>> > ZFS cannot heal corrupted blocks that are detected by ZFS checksums."
>>
>> This basically means that the checksum itself is not sufficient to
>> accomplish correction. However, if ZFS-level RAID is used, the correct
>> block can be obtained from a redundant copy (a sketch of this follows
>> the quoted exchange below).
>>
>> > Based on that, if we have LUNs already in RAID-5 being served from
>> > intelligent storage arrays, is there any benefit to creating the
>> > zpool as a mirror if ZFS can't heal any corrupted blocks? Or would
>> > we just be wasting disk space?
>>
>> This is a matter of opinion. If ZFS does not have access to redundancy,
>> then it cannot correct any problems that it encounters, and could even
>> panic the system, or the entire pool could be lost. However, if the
>> storage array and all associated drivers, adapters, memory, and links
>> are working correctly, then this risk may be acceptable (to you).
>>
>> ZFS experts at Sun say that even the best storage arrays may not detect
>> and correct some problems, and that complex systems can produce errors
>> even though all of their components seem to be working correctly. This
>> is in spite of Sun also making a living by selling such products. The
>> storage array can only correct errors it detects, either because the
>> hardware reports an unrecoverable error condition or because it
>> double-checks against data on a different drive. Since storage arrays
>> want to be fast, they are likely to engage additional validity
>> checks/correction only after a problem has already been reported (or
>> during a scrub/resilver) rather than as a matter of course.
>>
>> A problem which may occur is that your storage array says the data is
>> good while ZFS says the data is bad. Under these conditions there might
>> not be a reasonable way to correct the problem other than to lose the
>> data. If the ZFS pool requires the failed data in order to operate,
>> then the entire pool could be lost.
>
> A couple of questions on this topic -
>
> What percentage of the data in a zpool, if hit by one of these bit
> corruption errors, will actually cause the zpool to fail? Is it a higher
> or lower percentage than what it would take to fatally and irrevocably
> corrupt UFS or VxFS to the point where a restore is required?
>
> Given that today's storage arrays catch a good percentage of errors and
> correct them (for the intelligent arrays I have in mind, anyway), are we
> talking about the nasty, silent corruption I've been reading about that
> occurs in huge datasets, where the RAID thinks the data is good but it's
> actually garbage? From what I remember reading, that has a low
> occurrence rate and only became noticeable because we're dealing with
> such large amounts of data these days. Am I wrong here?
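To make the point about ZFS-level redundancy concrete, here is a minimal
sketch of the two usual ways to give ZFS a second copy to heal from on top
of array LUNs. The pool name and device names are hypothetical, and both
LUNs are assumed to already be RAID-5 protected inside the array:

    # Mirror two array LUNs; ZFS can then repair a block that fails its
    # checksum by reading the copy on the other LUN:
    zpool create tank mirror c2t0d0 c2t1d0

    # Or keep a single LUN but store two copies of every newly written
    # block; this heals bad blocks but does not survive losing the LUN:
    zpool create tank c2t0d0
    zfs set copies=2 tank

Without one of these (a single unmirrored LUN with copies=1), ZFS can
detect a bad block but can only report it, which is the situation Bob
describes above.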
Yes - you're "wrong" - but not because you're unintelligent or saying
something "wrong". You can be let down by a bad FC (Fibre Channel) port on
a switch (random noise), by a bad optical component somewhere in the path
between the host system that is writing the data and its final destination
(read: "expensive FC hardware SAN box"), by a bad optical connection, or by
a flaky data comm link. Or by a firmware bug (in your high-$dollar SAN box)
after the last firmware upgrade you performed on it.

There are already well documented cases where an OP mailed the ZFS list
and said "my SAN box has been working correctly for X years, and when I
used ZFS to store data on it, ZFS 'said' that the data is bad. ZFS is
'broken' (technical term (TM)) and not ready for prime time." In *all* of
those cases, it turned out that ZFS was *not* broken and that there was a
problem somewhere in the data path, or with the SAN hardware/firmware.

Also - look at the legacy posts and see where an OpenSolaris developer
discovered that the errors being reported by ZFS were caused by a
flaky/noisy power supply in his desktop box - despite the fact that that
particular desktop was very popular with other (OpenSolaris) kernel
developers and was widely regarded as "fool-proof".

It's probably true to say that ZFS is the first filesystem that allowed
those high-$dollar hardware SAN vendors to actually verify that their
complex hardware/firmware chain was behaving as designed, end-to-end.
Where "end-to-end" means that the data the host system writes is actually
the data that can be retrieved N years after it has been written! (A quick
way to run that check yourself is sketched at the end of this message.)

> So, looking at making operational decisions in the short term, I have
> to ask specifically: is it more or less likely that a zpool will die
> and have to be restored than UFS or VxFS filesystems on a VxVM volume?
>
> My opinions and questions are my own, and do not necessarily represent
> those of my employer. (or my coworkers, or anyone else)
>
> cheers,
> Brian
>
>> Bob
>> ======================================
>> Bob Friesenhahn
>> [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
>> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

Regards,

--
Al Hopper  Logical Approach Inc, Plano, TX  [EMAIL PROTECTED]
           Voice: 972.379.2133  Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
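To tie the end-to-end point above to something runnable: a scrub asks ZFS
to read back every allocated block in a pool and verify it against its
checksum. The pool name below is hypothetical:

    # Verify every allocated block against its checksum:
    zpool scrub tank

    # Report the result, including per-device checksum error counts and,
    # once errors exist, the affected files:
    zpool status -v tank

Checksum errors showing up here while the array reports a clean bill of
health is exactly the "ZFS says bad, SAN says good" situation discussed
earlier in the thread.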