Let me start this off with a personal philosophy statement. In technical
matters, there is almost never a “best”. There is only the best compromise
given the objective you’re trying to achieve.

If you change the objectives even slightly, you may get wildly different “best 
compromise” answers.

I came to zfs looking for the answer to the following problem:
1.      I need data backups for data chunks too big to be practical or simple
on DVDs.
2.      While I need a lot of bits, I don’t necessarily need the most bits I 
can afford.
3.      Elimination of silent bit-rot is a major priority.

Yes, I’m looking at a RAID for backup. But I want it as a backup store that
may not be spinning 24/7/365. I don’t plan to use the RAID for on-line work;
that stays on the individual hard drives in the various machines being used.
Another backup may well be on DVD or other static medium for more critical
data.

I’ll just blather a bit. The most durable data backup medium humans have come
up with was invented about 4000-6000 years ago: fired cuneiform tablets, as
used in the Middle East. Perhaps one could include the stone carvings of the
Egyptian and/or Maya cultures in that.
The second most durable medium we have is ink on paper/papyrus. There are
records on this medium that are still readable after 3000-4000 years. That’s
after being weathered and buried for kiloyears, with little or no human help
with preservation.

The modern computer era has nothing that even comes close. Our data storage 
media are largely temporary measures. We are very much like the performer I 
remember from the Ed Sullivan show (egad, I’m old!) who had an array of 
vertical rods upon which he spun ceramic plates, manipulating the rods to keep 
the plates spinning because if they ever spun down, the plate would fall off 
and break. How many of you can read a 3.5” floppy disk? A 5.25” floppy? An 8” 
floppy? A 1” data tape from the 1960s? 

Our modern data preservation relies on recopying the data onto new data media 
every so often before the mechanism to read the old medium becomes obsolete or 
irreparable and the medium itself decays. With our data, we are an exact
analog of the plate-spinner.

Worse yet, our media are not perfect. An otherwise perfectly-written record
will accumulate errors and eventually become unusable. Cuneiform does too,
but the scale is very different. Having evaluated my data needs, I have some
data that needs to be readable until modestly past my death. This puts the
archival time in decades, not centuries or millennia.

This line of reasoning led me to zfs. The background scrub addresses item #3.
I can buy new media as it becomes available to increase the storage modestly
as affordable (item #2). And I can’t bet on a truly archival data storage
technology becoming available; it may not get there in my lifetime.
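
To make item #3 concrete, here’s a minimal sketch of the scrub routine I have
in mind. The pool name “tank” and the schedule are my assumptions for
illustration, not anything prescribed:

  # Kick off a scrub; zfs walks every block and verifies its checksum,
  # repairing from parity/redundancy where a copy has rotted.
  zpool scrub tank

  # Check progress and see any repaired or unrecoverable errors.
  zpool status -v tank

  # For a pool that isn't spinning 24/7, a cron entry on whatever schedule
  # matches its power-on habits, e.g. monthly:
  # 0 3 1 * * /usr/sbin/zpool scrub tank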

Given that, here’s my cut on some of the questions here:

“Sweet spot for disks”: I don’t need the absolute most bits per dollar. While
it would be nice to have that, it’s not crucial. Every disk size was once the
sweet spot. Being a little back from the leading edge is probably a good thing
for reliability (I have all these scars…), and when I have enough, cool. It’s
better for my special case to have the most reliable data reading setup than
it is to have the most bits.

“Fewer/bigger versus more/smaller drives”: Tim and Frank have worked this one
over. I made the choice based on wanting to get a raidz3 setup, for which more
disks are needed than for raidz or raidz2. This idea comes out of the
time-to-resilver versus time-to-failure line of reasoning (see the sketch
after this list).

“Disk drives cost $100”: Yes, I fully agree, with minor exceptions. End of
marketing, which is where the cost per drive drops significantly, is different
from end of life – I hope! In my case, I got 0.75TB drives for $58 each. The
cost per bit is higher than with 1TB or 1.5TB drives, all right, but I can buy
more of them, and that lets me put another drive on for the next level of
error-correction data. In my special, very limited set of criteria, a few
extra disks are better than modestly more bits. This will obviously change
when I have exhausted the number of bits available. Ideally, at that time,
50TB disks will be the sweet spot.
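
To put rough numbers on the raidz3 trade, here’s a hedged sketch. The
eight-drive count, the device names, and the 1.5TB price are assumptions for
illustration; the $58/750GB figure is what I actually paid:

  # A raidz3 vdev spends three drives' worth of capacity on parity, so
  # N drives of capacity C give (N - 3) * C of usable space, and the pool
  # survives any three simultaneous drive failures.
  # Eight 750GB drives: (8 - 3) * 0.75TB = 3.75TB usable, for 8 * $58 = $464.
  zpool create tank raidz3 c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
                           c1t4d0 c1t5d0 c1t6d0 c1t7d0

  # Cost per raw TB: $58 / 0.75TB is about $77/TB. A 1.5TB drive at, say,
  # $110 (assumed) is about $73/TB -- cheaper per bit, but fewer spindles
  # for the same money means settling for less parity.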

Another issue I had not put out consciously until I wrote this is that this
is a learning system for me. I’m new to Solaris and zfs. Lowest entry cost
matters while I use up my “classroom workbook” system. Frankly, I flirted
with buying batches of old 100GB drives for the learner system. That had the
advantages of (1) running the cost even lower and (2) making a few failures
likely, which would force me to actually test the recovery parts of zfs
rather than think “Kewl, it all works fine!” right up until the first
failure, only to find out I can’t recover. Kind of like how you really,
really should have your fire extinguishers tested or replaced from time to
time.
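
For that kind of fire drill, the minimal exercise I have in mind looks
something like this (the pool and device names are made up for illustration):

  # Take a drive out of service to simulate a failure.
  zpool offline tank c1t3d0

  # The pool should report DEGRADED but keep serving data.
  zpool status tank

  # Swap in a replacement and watch the resilver run to completion.
  zpool replace tank c1t3d0 c1t8d0
  zpool status tank    # shows resilver progress, then ONLINE again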

But none of this is hard and fast math. It’s what seems right based on the 
specific set of criteria. I decided, wrong or right, that I needed to learn to 
use raidz3 for extra belts and suspenders; that suggested I needed more disk 
drives, and that made 1.5TB drives in quantity expensive. A deal on 750GB 
drives iced that one down. It was interesting, if modestly irrelevant, that
these were raid-rated drives.

If your criterion is most bits per dollar, the reasoning flips around a lot.
If it’s fewest dollars, not fewest dollars versus learning, the reasoning
changes.
If it’s speed, it changes. If it’s thermal issues, it changes. Where you’re 
going has a huge effect on the vehicle you choose.