> > I was trying to get you to evaluate ZFS's incremental risk reduction *quantitatively* (and if you actually did so you'd likely be surprised at how little difference it makes - at least if you're at all rational about assessing it).
>
> ok .. i'll bite since there's no ignore feature on the list yet: what are you terming as "ZFS' incremental risk reduction"? .. (seems like a leading statement toward a particular assumption)
Primarily its checksumming features, since other open source solutions support simple disk scrubbing (which, given its ability to catch most deteriorating disk sectors before they become unreadable, probably has a greater effect on reliability than checksums in any environment where the hardware hasn't been slapped together so sloppily that connections are flaky).

Aside from the problems that scrubbing handles (and you need scrubbing even if you have checksums, because scrubbing is what helps you *avoid* data loss rather than just discover it after it's too late to do anything about it), and aside from problems deriving from sloppy assembly (which tend to become obvious fairly quickly, though it's certainly possible for some to be more subtle), checksums primarily catch things like bugs in storage firmware and otherwise undetected disk read errors (which occur orders of magnitude less frequently than uncorrectable read errors).

Robert Milkowski cited some sobering evidence that mid-range arrays may have non-negligible firmware problems that ZFS could often catch, but a) those are hardly 'consumer' products (to address that sub-thread, which I think is what applies in Stefano's case) and b) ZFS's claimed attraction for higher-end (corporate) use is its ability to *eliminate* the need for such products (hence its ability to catch their bugs would not apply - though I can understand why people who needed to use them anyway might like to have ZFS's integrity checks along for the ride, especially when using less-than-fully-mature firmware).

And otherwise undetected disk errors occur with negligible frequency compared with software errors that can silently trash your data in ZFS cache or in application buffers (especially in PC environments: enterprise software at least tends to be more stable and more carefully controlled - not to mention their typical use of ECC RAM).
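To make the scrub-versus-checksum distinction above concrete, here is a toy sketch in Python. Everything in it is hypothetical (the block list, the checksum table, and the repair_from_redundancy helper are illustrative inventions, not ZFS internals): the point is only that a scrub reads every block *now*, while redundancy still exists to repair what has silently rotted, whereas checksums alone merely flag bad data whenever it happens to be read.

```python
import hashlib

# Toy model only: contrasts a proactive scrub pass with on-read
# checksum verification.  Block layout, checksum storage, and the
# repair helper are all hypothetical, not ZFS internals.

def scrub(blocks, checksums, repair_from_redundancy):
    """Read every block now, while a redundant copy still exists
    to repair any block that has silently gone bad."""
    repaired = []
    for i, data in enumerate(blocks):
        if hashlib.sha256(data).hexdigest() != checksums[i]:
            blocks[i] = repair_from_redundancy(i)  # fix before it's too late
            repaired.append(i)
    return repaired

# Demo: three blocks, one of which rots on disk after checksumming.
good = [b"alpha", b"bravo", b"charlie"]
sums = [hashlib.sha256(b).hexdigest() for b in good]
blocks = [b"alpha", b"XXXXX", b"charlie"]        # block 1 silently corrupted
fixed = scrub(blocks, sums, lambda i: good[i])   # redundant copy still intact
print(fixed)            # [1] -- latent error caught and repaired proactively
print(blocks == good)   # True
```

Without the scrub pass, the corruption in block 1 would sit undetected until something read it, by which point the redundant copy might be gone too.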
So depending upon ZFS's checksums to protect your data in most PC environments is sort of like leaving on a vacation and locking and bolting the back door of your house while leaving the front door wide open: yes, a burglar is less likely to enter by the back door, but thinking that the extra bolt there made you much safer is likely foolish.

> .. are you just trying to say that without multiple copies of data in multiple physical locations you're not really accomplishing a more complete risk reduction

What I'm saying is that if you *really* care about your data, then you need to be willing to make the effort to lock and bolt the front door as well as the back door and install an alarm system: if you do that, *then* ZFS's additional protection mechanisms may start to become significant (because you've eliminated the higher-probability risks, and ZFS's extra protection then actually reduces the *remaining* risk by a significant percentage). Conversely, if you don't care enough about your data to take those extra steps, then adding ZFS's incremental protection won't reduce your net risk by a significant percentage (because the other risks that still remain are so much larger).

Was my point really that unclear before? It seems as if this must be at least the third or fourth time that I've explained it.

> yes i have read this thread, as well as many of your other posts around usenet and such .. in general i find your tone to be somewhat demeaning (slightly rude too - but - eh, who's counting? i'm none to judge)

As I've said multiple times before, I respond to people in the manner they seem to deserve. This thread has gone on long enough that there's little excuse for continued obtuseness at this point, but I still attempt to be pleasant as long as I'm not responding to something verging on hostile.

> now, you do know that we are currently in an era of collaboration instead of deconstruction right?
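Since the complaint at the top of the thread was that nobody would evaluate this *quantitatively*, here is the shape of the arithmetic. Every probability below is made up purely for illustration (they are not measured failure rates); the point is what happens to the totals, not the specific numbers.

```python
# Illustrative arithmetic only: all probabilities are invented to show
# the shape of the argument, not measured failure rates.
software_bugs   = 1e-2   # silent corruption in caches/app buffers, no ECC
sloppy_hardware = 1e-3   # flaky connections, careless assembly
silent_disk     = 1e-5   # otherwise-undetected disk errors (what checksums catch)

# Back door bolted, front door wide open: checksums remove only the
# smallest term, so the net risk barely moves.
total_before   = software_bugs + sloppy_hardware + silent_disk
with_checksums = software_bugs + sloppy_hardware
print(f"net reduction: {1 - with_checksums / total_before:.2%}")   # ~0.09%

# After locking the front door too (ECC RAM, stable software, careful
# assembly), only a small residual remains, and the same checksum
# protection now removes a large share of what's left.
residual_other = 2e-5    # hypothetical residual risk after hardening
print(f"share of remaining risk removed: "
      f"{silent_disk / (residual_other + silent_disk):.0%}")       # ~33%
```

The same absolute reduction is a rounding error in one regime and a third of your remaining exposure in the other, which is the whole front-door/back-door point.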
Can't tell it from the political climate, and corporations seem to be following that lead (I guess they've finally stopped just gazing in slack-jawed disbelief at what this administration is getting away with and decided to cash in on the opportunity themselves). Or were you referring to something else?

> .. so i'd love to see the improvements on the many shortcomings you're pointing to and passionate about written up, proposed, and freely implemented :)

Then ask the ZFS developers to get on the stick: fixing the fragmentation problem discussed elsewhere should be easy, and RAID-Z is at least amenable to a redesign (though not without changing the on-disk metadata structures a bit - but while they're at it, they could include support for data redundancy in a manner analogous to ditto blocks so that they could get rid of the vestigial LVM-style management in that area).

Changing ZFS's approach to snapshots from block-oriented to audit-trail-oriented, in order to pave the way for a journaled rather than shadow-paged approach to transactional consistency (which then makes data redistribution easier, allowing rebalancing not only across local disks but across multiple nodes using algorithmic rather than pointer-based placement), starts to get more into a 'raze it to the ground and start over' mode, though - leaving plenty of room for one or more extended postscripts to 'the last word in file systems'.

- bill

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss