This is getting pretty picky. You're saying that ZFS will detect any errors introduced after ZFS has gotten the data. However, as stated in a previous post, that doesn't guarantee that the data given to ZFS wasn't already corrupted.

If you don't trust your storage subsystem, you're going to encounter issues regardless of the software used to store data. We'll have to see if ZFS can 'save' customers in this situation. I've found that, regardless of the storage solution in question, you can't anticipate all issues, and when a brownout or other ugly loss-of-service occurs, your data may or may not be intact, ZFS or no.

I've never seen a product that can deal with all possible situations.

On Jun 27, 2006, at 9:01 AM, Jeff Victor wrote:

Unfortunately, a storage-based RAID controller cannot detect errors which occurred between the filesystem layer and the RAID controller, in either direction - in or out. ZFS will detect them through its use of checksums.

But ZFS can only fix them if it can access redundant bits. It can't tell a storage device to provide the redundant bits, so it must use its own data protection system (RAIDZ or RAID1) in order to correct errors it detects.
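
The distinction between detecting and correcting can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ZFS code: SHA-256 stands in for whatever checksum the filesystem uses, the "mirror" is a plain list of in-memory copies, and the names checksum and read_with_repair are hypothetical.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Stand-in checksum (assumption); the real filesystem may use another algorithm.
    return hashlib.sha256(data).digest()

def read_with_repair(copies, expected):
    """Return (data, repaired_count) for a block stored as several copies.

    copies   -- list of bytearray objects, one per mirror side (hypothetical model)
    expected -- checksum recorded when the block was written
    """
    # Detection: a copy is good only if it matches the stored checksum.
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        # With no intact redundant copy the error is detected but cannot be fixed.
        raise IOError("checksum mismatch on every copy; not correctable")
    # Correction: overwrite any copy that differs from the verified one.
    repaired = 0
    for i, c in enumerate(copies):
        if bytes(c) != good:
            copies[i] = bytearray(good)
            repaired += 1
    return good, repaired
```

With two copies and one corrupted, the read returns the good data and rewrites the bad side. With a single copy, the same corruption can only be reported.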


Gregory Shaw wrote:
Most controllers support a background-scrub that will read a volume and repair any bad stripes. This addresses the bad block issue in most cases. It still doesn't help when a double-failure occurs. Luckily, that's very rare. Usually, in that case, you need to evacuate the volume and try to restore what was damaged.
On Jun 26, 2006, at 6:40 PM, Eric Schrock wrote:
On Mon, Jun 26, 2006 at 05:26:24PM -0600, Gregory Shaw wrote:


You're using hardware raid. The hardware raid controller will rebuild the volume in the event of a single drive failure. You'd need to keep
on top of it, but that's a given in the case of either hardware or
software raid.


True for total drive failure, but there are more failure modes
than that. With hardware RAID, there is no way for the RAID controller
to know which block was bad, and therefore it cannot repair the block.
With RAID-Z, we have the integrated checksum and can do combinatorial
analysis to know not only which drive was bad, but what the data
_should_ be, and can repair it to prevent more corruption in the future.
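
A rough sketch of that combinatorial idea, assuming a single XOR parity column and a checksum over the whole logical block. This is a simplified model and not the actual RAID-Z implementation; SHA-256 and the names xor_bytes, parity_of and combinatorial_repair are illustrative assumptions.

```python
import hashlib
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def parity_of(columns):
    # Single XOR parity over equal-length data columns (simplified model).
    return reduce(xor_bytes, columns)

def combinatorial_repair(columns, parity, expected):
    """Return (bad_index_or_None, repaired_columns).

    Tries each data column in turn as the suspect, rebuilds it from the
    remaining columns plus parity, and accepts the first candidate whose
    checksum matches the one stored for the block.
    """
    def matches(cols):
        return hashlib.sha256(b"".join(cols)).digest() == expected

    if matches(columns):
        return None, list(columns)      # data is fine as read
    for bad in range(len(columns)):
        others = [c for i, c in enumerate(columns) if i != bad]
        rebuilt = reduce(xor_bytes, others, parity)
        candidate = list(columns)
        candidate[bad] = rebuilt
        if matches(candidate):
            return bad, candidate       # we know which drive lied and what it should hold
    raise IOError("no single-column reconstruction matches the checksum")
```

Parity alone only says that something in the stripe is inconsistent. It is the independent checksum that lets the loop decide which reconstruction is the right one.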

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273        Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382              [EMAIL PROTECTED] (work)
Louisville, CO 80028-4382                 [EMAIL PROTECTED] (home)
"When Microsoft writes an application for Linux, I've Won." - Linus Torvalds
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
--------------------------------------------------------------------------
Jeff VICTOR              Sun Microsystems            jeff.victor @ sun.com
OS Ambassador            Sr. Technical Specialist
Solaris 10 Zones FAQ:    http://www.opensolaris.org/os/community/zones/faq
--------------------------------------------------------------------------


