This is getting pretty picky. You're saying that ZFS will detect any
errors introduced after ZFS has gotten the data. However, as stated
in a previous post, that doesn't guarantee that the data given to ZFS
wasn't already corrupted.
If you don't trust your storage subsystem, you're going to encounter
issues regardless of the software used to store data. We'll have to
see if ZFS can 'save' customers in this situation. I've found that
regardless of the storage solution in question you can't anticipate
all issues and when a brownout or other ugly loss-of-service occurs,
you may or may not be intact, ZFS or no.
I've never seen a product that can deal with all possible situations.
On Jun 27, 2006, at 9:01 AM, Jeff Victor wrote:
Unfortunately, a storage-based RAID controller cannot detect errors
which occurred between the filesystem layer and the RAID
controller, in either direction - in or out. ZFS will detect them
through its use of checksums.
But ZFS can only fix them if it can access redundant bits. It
can't tell a storage device to provide the redundant bits, so it
must use its own data protection system (RAIDZ or RAID1) in order
to correct errors it detects.
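
To make the detect-versus-repair distinction concrete, here is a minimal
sketch in Python. It is not how ZFS is implemented. The MirroredBlock
class, the two-copy layout, and the choice of SHA-256 are assumptions
made purely for illustration. The point it shows is the one above: a
checksum stored apart from the data can detect a bad copy, but repair
only works if a second copy that still matches the checksum is available.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # Illustrative choice of hash; any strong checksum would do here.
    return hashlib.sha256(data).digest()


class MirroredBlock:
    """Hypothetical two-way mirror with the checksum kept apart from the data."""

    def __init__(self, data: bytes):
        self.copies = [bytearray(data), bytearray(data)]
        self.expected = checksum(data)

    def read(self) -> bytes:
        good = None
        bad = []
        # Verify every copy against the independently stored checksum.
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) == self.expected:
                if good is None:
                    good = bytes(copy)
            else:
                bad.append(i)
        if good is None:
            # Detection without redundancy: we know it is wrong, but cannot fix it.
            raise OSError("all copies fail checksum; detected but not repairable")
        # Self-heal: overwrite each bad copy with the verified one.
        for i in bad:
            self.copies[i] = bytearray(good)
        return good
```

If one copy is damaged, read() returns the verified data and rewrites the
damaged copy. If both are damaged, the error is reported but nothing can
be repaired, which is the situation when the only redundancy lives
inside a storage device that the filesystem cannot query.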
Gregory Shaw wrote:
Most controllers support a background-scrub that will read a
volume and repair any bad stripes. This addresses the bad block
issue in most cases.
It still doesn't help when a double-failure occurs. Luckily,
that's very rare. Usually, in that case, you need to evacuate
the volume and try to restore what was damaged.
On Jun 26, 2006, at 6:40 PM, Eric Schrock wrote:
On Mon, Jun 26, 2006 at 05:26:24PM -0600, Gregory Shaw wrote:
You're using hardware raid. The hardware raid controller will rebuild
the volume in the event of a single drive failure. You'd need to keep
on top of it, but that's a given in the case of either hardware or
software raid.
True for total drive failure, but there are more failure modes
than that. With hardware RAID, there is no way for the RAID controller
to know which block was bad, and therefore it cannot repair the block.
With RAID-Z, we have the integrated checksum and can do combinatorial
analysis to know not only which drive was bad, but what the data
_should_ be, and can repair it to prevent more corruption in the
future.
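
The combinatorial idea can be sketched in a few lines of Python. This is
a simplified single-parity model, not the actual RAID-Z code. The
function names, the XOR parity layout, and the use of SHA-256 are all
assumptions for illustration. When the checksum over the data fails, the
sketch assumes each column in turn is the bad one, rebuilds it from
parity and the remaining columns, and accepts the first result whose
checksum matches.

```python
import hashlib
from functools import reduce


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def parity(columns):
    # Single XOR parity across equally sized data columns.
    return reduce(xor, columns)


def checksum(columns) -> bytes:
    return hashlib.sha256(b"".join(columns)).digest()


def read_stripe(data_cols, parity_col, expected):
    """Return (data, bad_index); bad_index is None if the data verified as read."""
    if checksum(data_cols) == expected:
        return list(data_cols), None
    # Try each column as the suspect and rebuild it from parity and the rest.
    for i in range(len(data_cols)):
        others = [c for j, c in enumerate(data_cols) if j != i]
        candidate = reduce(xor, others, parity_col)
        trial = list(data_cols)
        trial[i] = candidate
        if checksum(trial) == expected:
            return trial, i  # which column was bad, and what it should be
    raise OSError("no single-column reconstruction matches the checksum")
```

Without the checksum, a parity mismatch only says that some column in
the stripe is wrong. The checksum is what lets the loop above decide
which reconstruction is the correct one.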
- Eric
--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273 Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive MS 4382 [EMAIL PROTECTED] (work)
Louisville, CO 80028-4382 [EMAIL PROTECTED] (home)
"When Microsoft writes an application for Linux, I've Won." -
Linus Torvalds
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
--
--------------------------------------------------------------------------
Jeff VICTOR              Sun Microsystems            jeff.victor @ sun.com
OS Ambassador            Sr. Technical Specialist
Solaris 10 Zones FAQ:    http://www.opensolaris.org/os/community/zones/faq
--------------------------------------------------------------------------