On Fri, 2010-04-09 at 08:07 +1000, Daniel Carosone wrote:
> On Thu, Apr 08, 2010 at 12:14:55AM -0700, Erik Trimble wrote:
> > Daniel Carosone wrote:
> >> Go with the 2x7 raidz2. When you start to really run out of space,
> >> replace the drives with bigger ones.
> >
> > While that's great in theory, there's getting to be a consensus that
> > 1TB 7200RPM 3.5" SATA drives are really going to be the last usable
> > capacity.
>
> I dunno. The 'forces' and issues you describe are real, but 'usable'
> depends very heavily on the user's requirements.
>
Well... the problem is (and this isn't just a ZFS issue) that resilver
and scrub times /are/ very bad for >1TB disks. This goes directly to the
problem of redundancy - if you don't really care about resilver/scrub
times, then there's little point in using raidz or mirroring at all;
reliability-wise you end up in much the same ballpark. That is, >1TB
3.5" drives have such long resilver/scrub times that, even with ZFS,
it's a good bet a second (or third) drive fails before the resilver for
the first failure completes. Put another way, you accumulate new errors
faster than you can repair the old ones, which effectively means errors
become permanent before you can fix them. Permanent errors = data loss.

> For example, a large amount of the extra space available on a larger
> drive may be very rarely accessed in normal use (scrubs and resilvers
> aside). In the OP's example of an ever-expanding home media
> collection, much of it will never or very rarely get
> re-watched. Another common use for the extra space is simply storing
> more historical snapshots, against the unlikely future need to access
> them. For such data, speed is really not a concern at all.
>

Yes, it is. It's still a concern, and not just in the scrub/resilver
arena. Big drives have considerably lower performance, to the point that
replacing 1TB drives with 2TB drives may well drop throughput below the
threshold where playback starts to stutter. That is, a setup that works
with 1TB drives may not work with 2TB drives - it's not a no-brainer to
just upgrade the size. For example, the 2TB 5900RPM 3.5" drives are (on
average) over 2x as slow as the 1TB 7200RPM 3.5" drives for most
operations: access time is slower by 40%, and throughput is slower by
30-50%.

> For the subset of users for whom these forces are not overwhelming for
> real usage, that leaves scrubs and resilvers.
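To put rough numbers on the resilver-time argument above (a sketch only:
the ~100 MB/s sustained rate is an assumed best case, not a measurement):

```shell
# Best-case (purely sequential) resilver time for a full drive.
# The assumed ~100 MB/s sustained rate is hypothetical; a fragmented
# pool resilvers with mostly random I/O and can be an order of
# magnitude slower than this floor.
capacity_mb=$(( 2 * 1000 * 1000 ))   # 2TB drive, in MB
throughput=100                       # assumed MB/s, sequential best case
seconds=$(( capacity_mb / throughput ))
hours=$(( seconds / 3600 ))
echo "best-case resilver: ~${hours} hours"
```

With these numbers that comes out to 20,000 seconds, roughly 5.5 hours -
and that's the floor for a purely sequential pass, not the expectation
for a real pool under load.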
> There is room for
> improvement in zfs here, too - a more sequential streaming access
> pattern would help.
>

While ZFS certainly has problems with randomly-written small-file
pools, scrubs and resilvers over large streaming writes (like the media
server's) are rather straightforward. Note that RAID-6 and many
hardware RAID-5/3 setups have similar issues. In any case,
resilver/scrub times are becoming the dominant factor in the
reliability of these large drives.

> To me, the biggest issue you left unmentioned is the problem of
> backup. There's little option for backing up these larger drives,
> other than more of the same drives. In turn, lots of the use such
> drives will be put to, is for backing up other data stores, and there
> again, the usage pattern fits the above profile well.
>
> Another usage pattern we may see more of, and that helps address some
> of the performance issues, is this. Say I currently have 2 pools of
> 1TB disks, one as a backup for the other. I want to expand the
> space. I replace all the disks with 2TB units, but I also change my
> data distribution as it grows: now, each pool is to be at most
> half-full of data, and the other half is used as a backup of the
> opposite pool. ZFS send is fast enough that the backup windows are
> short, and I now have effectively twice as many spindles in active
> service.
>

Don't count on 'zfs send' being fast enough, even for liberal values of
"fast enough" - it's highly data-dependent. And the situation you
describe actually makes things worse: now both pools carry a backup I/O
load, which reduces their available throughput. If you're talking about
a pool that's already 50% slower than one made of 1TB drives, then,
well, you're hosed.

> > [..] it looks like hard drives are really at the end of their
> > advancement, as far as capacities per drive go.
>
> The challenges are undeniable, but that's way too big a call.
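For reference, the mutual-backup scheme described above would be driven
by something like the following (a sketch only: the pool names
tank/vault, the dataset layout, and the snapshot labels are all
hypothetical, and the data-dependent performance caveat above still
applies):

```shell
# Each pool keeps its live data in <pool>/data and receives the other
# pool's replication stream into <pool>/backup. The incremental send
# keeps the backup window short only when little data has changed.
zfs snapshot -r tank/data@2010-04-09
zfs send -R -i tank/data@2010-04-08 tank/data@2010-04-09 \
    | zfs receive -duF vault/backup

# ...and the reverse direction for the other pool:
zfs snapshot -r vault/data@2010-04-09
zfs send -R -i vault/data@2010-04-08 vault/data@2010-04-09 \
    | zfs receive -duF tank/backup
```

Note that both streams hit both pools (one as reader, one as writer), so
each run imposes the backup I/O load on both sides at once.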
> Those
> are words you will regret in future; at least, I hope the future will
> be one in which those words are regrettable. :-)
>

Honestly, from what I've seen and heard both here and on other forums,
the writing is on the wall, the fat lady has sung, and Mighty Casey has
struck out. The 3.5" winchester hard drive is on terminal life support
for enterprise use. It will linger a little longer in commodity places,
where its cost/GB overcomes its weaknesses. 2.5" HDs will last out the
decade, as their slightly higher performance/GB and their space/power
savings will let them hold off solid-state media for a bit. But
solid-state is the future, and a very near future it is.

> > >1TB drives currently have excessively long resilver time, inferior
> > reliability (for the most part), and increased power consumption.
>
> Yes, for the most part. However, a 2TB drive has dramatically less
> power consumption than 2x1TB drives (and less of other valuable
> resources, like bays and controller slots).
>

GB/Watt, yes. Performance/Watt, no. And you chew up additional
bays/slots/etc. trying to win back the performance lost to the larger
drives.

> > I'd generally recommend that folks NOT step beyond the 1TB capacity
> > at the 3.5" hard drive format.
>
> A general recommendation is fine, and this is one I agree with for
> many scenarios. At least, I'd recommend that folks look more closely
> at alternatives using 2.5" drives and sas expander bays than they
> might otherwise.
>
> > So, while it's nice that you can indeed seamlessly swap up drive
> > sizes (and your recommendation of using 2x7 helps that process), in
> > reality, it's not a good idea to upgrade from his existing 1TB
> > drives.
>
> So what does he do instead, when he's running out of space and 1TB
> drives are hard to come by?
> The advice still stands, as far as I'm
> concerned: do something now, that will leave you room for different
> expansion choices later - and evaluate the best expansion choice
> later, when the parameters of the time are known.
>
> --
> Dan.

I echo what Bob said earlier: don't plan on being able to upgrade these
disks in-place. Plan for expanding the setup, but you won't be able to
upgrade the 1TB disks to larger capacities - at least not for the better
part of this decade, until you can replace them with solid-state drives
of some sort.

As a practical matter, small setups are for the most part not
expandable/upgradable much, if at all. Buy what you need now, and plan
on buying something new in 5-10 years, but don't think that what you put
together now can be continuously upgraded for a decade. You make too
many tradeoffs in the initial design to allow for that kind of
upgradability. Even enterprise gear these days is pretty much "dispose
and replace", not "upgrade".

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss