> On 9/15/06, can you guess? <[EMAIL PROTECTED]> wrote:

...
> file-level, however, is really pushing it. You might end up with an
> administrative nightmare deciphering which files have how many copies.

I'm not sure what you mean: the level of redundancy would be a per-file attribute that could be examined, and would normally just be defaulted to a common value.

...

> > It would be interesting to know whether that would still be your
> > experience in environments that regularly scrub active data as ZFS does
> > (assuming that said experience was accumulated in environments that
> > don't). The theory behind scrubbing is that all data areas will be hit
> > often enough that they won't have time to deteriorate (gradually) to the
> > point where they can't be read at all, and early deterioration
> > encountered during the scrub pass (or other access) - while they have
> > only begun to become difficult to read - will result in immediate
> > revectoring (by the disk or, if not, by the file system) to healthier
> > locations.
>
> Scrubbing exercises the disk area to prevent bit-rot. I do not think
> FS's scrubbing changes the failure mode of the raw devices.

It doesn't change the failure rate (if anything, it might accelerate it marginally due to the extra disk activity), but it *does* change, potentially radically, the frequency with which sectors containing user data become unreadable - because it allows them to be detected *before* that happens, such that the data can be moved to a good sector (often by the disk itself, else by higher-level software) and the failing sector marked bad.

> OTOH, I really have no such experience to speak of *fingers crossed*. I
> failed to locate the code where the relocation of files happens but
> assume that copies would make this process more reliable.

Sort of: while they don't make any difference when you catch a failing sector while it's still readable, they certainly help if you only catch it after it has become unreadable (or has been 'silently' corrupted).

> > Since ZFS-style scrubbing detects even otherwise-undetectable 'silent
> > corruption' missed by the disk's own ECC mechanisms, that
> > lower-probability event is also covered (though my impression is that
> > the probability of even a single such sector may be significantly lower
> > than that of whole-disk failure, especially in laptop environments).
>
> I do not have any data to support nor dismiss that.

Quite a few years ago Seagate still published such data, but of course I didn't copy it down (because it was 'always available' when I wanted it - as I said, it was quite a while ago and I was not nearly as well-acquainted with the volatility of Internet data as I would subsequently become). But to the best of my recollection their enterprise disks at that time were specced to have no worse than 1 uncorrectable error for every petabit read and no worse than 1 undetected error for every exabit read.

A fairly recent paper by people who still have access to such data suggests that the frequency of uncorrectable errors in enterprise drives is still about the same, but that the frequency of undetected errors may have increased markedly (to perhaps once in every 10 petabits read) - possibly a result of ever-increasing on-disk bit densities and the more aggressive error correction required to handle them (perhaps this is part of the reason they don't make error rates public any more...). They claim that SATA drives have error rates around 10x those of enterprise drives (i.e., an undetected error rate of around once per petabit).
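(If anyone wants to check that arithmetic, here's a quick back-of-the-envelope sketch in Python. The once-per-petabit undetected-error figure for SATA drives is the one cited above; the 500,000-hour MTBF is just a number I've assumed for illustration, not a published laptop spec - plug in your own.)

# Rough sanity check, not from any vendor spec sheet:
# if a SATA-class drive sees about one undetected error per petabit read,
# how fast would a laptop have to read, 24/7, before an undetected error
# becomes about as likely as a whole-drive failure within the MTBF window?

PETABIT_BITS = 1e15      # 1 petabit, in bits
BITS_PER_BYTE = 8

# Assumptions (mine, for illustration only):
bits_per_undetected_error = PETABIT_BITS   # ~1 undetected error per petabit (SATA figure cited above)
assumed_mtbf_hours = 500_000               # assumed drive MTBF; real laptop ratings vary

bytes_per_undetected_error = bits_per_undetected_error / BITS_PER_BYTE  # ~1.25e14 bytes (~125 TB)
mtbf_seconds = assumed_mtbf_hours * 3600

# Sustained read rate at which the mean time to an undetected error equals the MTBF:
break_even_rate = bytes_per_undetected_error / mtbf_seconds

print(f"Break-even sustained read rate: {break_even_rate / 1024:.0f} KB/s")
# With these assumptions this prints roughly 68 KB/s, in the same ballpark
# as the 60 - 70 KB/sec figure below; a smaller MTBF scales the rate up proportionally.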
Figure out a laptop drive's average data rate and that gives you a mean time to encountering undetected corruption. Compare that to the drive's in-use MTBF rating and there you go! If I haven't dropped a decimal place or three doing this in my head, then even if laptop drives have nominal MTBFs equal to desktop SATA drives, it looks as if it would take an average data rate of 60 - 70 KB/sec (24/7, year-in, year-out) for an undetected error to become comparable in likelihood to a whole-disk failure: that's certainly nothing much for a fairly well-loaded server in constant (or even just 40-hour/week) use, but for a laptop?

- bill