On Fri, 2010-04-09 at 08:07 +1000, Daniel Carosone wrote:
> On Thu, Apr 08, 2010 at 12:14:55AM -0700, Erik Trimble wrote:
> > Daniel Carosone wrote:
> >> Go with the 2x7 raidz2.  When you start to really run out of space,
> >> replace the drives with bigger ones.
> >
> > While that's great in theory, there's getting to be a consensus that 1TB  
> > 7200RPM 3.5" Sata drives are really going to be the last usable capacity. 
> 
> I dunno.  The 'forces' and issues you describe are real, but 'usable'
> depends very heavily on the user's requirements.  
> 

Well....

The problem is (and this isn't just a ZFS issue) that resilver and scrub
times /are/ very bad for >1TB disks.  This goes directly to the problem
of redundancy - if a resilver takes so long that you're likely to lose
another drive before it finishes, then raidz or mirroring isn't really
buying you anything; you might as well not have bothered with redundancy
at all.


That is, >1TB 3.5" drives have such long resilver/scrub times that with
ZFS, it's a good bet you'll kill a second (or third) drive before you
can scrub or resilver in time to compensate for the already-failed one.
To put it another way: you accumulate new errors faster than you can fix
the old ones, which effectively means you can't repair errors before they
become permanent. Permanent errors = data loss.
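To put rough numbers on that, here's a back-of-envelope sketch. Every
figure in it is an assumption for illustration - the 13 surviving drives
(a 2x7 raidz2 that's lost one disk), the 1M-hour MTBF, the resilver
windows - and it uses a crude exponential failure model that ignores
correlated failures and the extra read stress a resilver puts on the
survivors, which is exactly what makes the real numbers worse:

```python
import math

def p_second_failure(n_remaining, mtbf_hours, resilver_hours):
    """Probability that at least one surviving drive fails during the
    resilver window, under a naive independent exponential model."""
    # Per-drive probability of surviving the window
    p_survive = math.exp(-resilver_hours / mtbf_hours)
    return 1 - p_survive ** n_remaining

# Assumed, illustrative numbers only: 13 surviving drives,
# 1,000,000-hour MTBF, resilver windows growing with drive size.
for hours in (12, 24, 72):
    p = p_second_failure(13, 1_000_000, hours)
    print(f"{hours:3d}h resilver window: {p:.5f}")
```

The point isn't the absolute numbers - it's that the exposure window
scales linearly with resilver time, and real failures during a resilver
are far more correlated than this toy model admits.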




> For example, a large amount of the extra space available on a larger
> drive may be very rarely accessed in normal use (scrubs and resilvers
> aside).  In the OP's example of an ever-expanding home media
> collection, much of it will never or very rarely get
> re-watched. Another common use for the extra space is simply storing 
> more historical snapshots, against the unlikely future need to access
> them.  For such data, speed is really not a concern at all.
> 

Yes, it is. It's still a concern, and not just in the scrub/resilver
arena. Big drives have considerably lower performance, to the point
where replacing 1TB drives with 2TB drives may very well drop a setup
below the threshold where playback starts to stutter.  That is, while
the setup may work with 1TB drives, it won't with 2TB drives.  Upgrading
the drive size is not a no-brainer.

For example, the 2TB 5900RPM 3.5" drives are (on average) over 2x as
slow as the 1TB 7200RPM 3.5" drives for most operations: access time is
slower by 40%, and throughput is lower by 30-50%.
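The access-time gap falls out of simple service-time arithmetic. The
seek figures below are assumed, illustrative values, not vendor specs:

```python
def avg_service_ms(rpm, avg_seek_ms):
    """Rough average time per random operation: seek plus rotational
    latency (half a revolution, on average)."""
    rotational_ms = (60_000 / rpm) / 2
    return avg_seek_ms + rotational_ms

# Assumed seek times: ~8.5 ms for a 7200RPM drive, ~12 ms for a
# 5900RPM drive.
d7200 = avg_service_ms(7200, 8.5)
d5900 = avg_service_ms(5900, 12.0)

print(f"7200RPM: {d7200:.1f} ms/op, ~{1000 / d7200:.0f} IOPS")
print(f"5900RPM: {d5900:.1f} ms/op, ~{1000 / d5900:.0f} IOPS")
print(f"slowdown: {d5900 / d7200 - 1:.0%}")
```

With these assumed inputs the slower spindle plus the longer seek land
in the same 30-40%-slower ballpark as the figures above.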


> For the subset of users for whom these forces are not overwhelming for
> real usage, that leaves scrubs and resilvers.  There is room for
> improvement in zfs here, too - a more sequential streaming access
> pattern would help.
> 

While ZFS certainly has problems with randomly written small-data pools,
scrubs and resilvers on large streaming writes (like the media server)
are rather straightforward. Note that RAID-6 and many RAID-5/3 hardware
setups have similar issues.

In any case, resilver/scrub times are becoming the dominant factor in
reliability of these large drives.


> To me, the biggest issue you left unmentioned is the problem of
> backup.  There's little option for backing up these larger drives,
> other than more of the same drives.  In turn, lots of the use such
> drives will be put to, is for backing up other data stores, and there
> again, the usage pattern fits the above profile well.
> 
> Another usage pattern we may see more of, and that helps address some
> of the performance issues, is this.  Say I currently have 2 pools of
> 1TB disks, one as a backup for the other.  I want to expand the
> space.  I replace all the disks with 2TB units, but I also change my
> data distribution as it grows: now, each pool is to be at most
> half-full of data, and the other half is used as a backup of the
> opposite pool.  ZFS send is fast enough that the backup windows are
> short, and I now have effectively twice as many spindles in active
> service. 
> 

Don't count on 'zfs send' being fast enough, even for liberal values of
"fast enough" - it's highly data-dependent.  For the situation you
describe, you're actually making things worse: both pools now carry a
backup I/O load, which reduces their available throughput. If you're
talking about a pool that's already 50% slower than one made of 1TB
drives, then, well, you're hosed.
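To see why, sketch the backup window arithmetic. All the inputs here
are made up for illustration: 14TB of data, the per-pool streaming
rates, and a 30% penalty while both pools carry backup I/O at once:

```python
def backup_window_hours(data_tb, pool_mb_s, concurrent_penalty):
    """Hours to stream a full backup, given a pool's sequential
    throughput degraded by the concurrent backup load on it."""
    effective_mb_s = pool_mb_s * (1 - concurrent_penalty)
    seconds = data_tb * 1_000_000 / effective_mb_s  # TB -> MB
    return seconds / 3600

# Assumed aggregate streaming rates: 400 MB/s for a pool of 1TB
# 7200RPM drives, 250 MB/s for the same count of 2TB 5900RPM drives,
# both losing 30% to the other pool's backup stream.
print(f"1TB-drive pool: {backup_window_hours(14, 400, 0.3):.1f} h")
print(f"2TB-drive pool: {backup_window_hours(14, 250, 0.3):.1f} h")
```

Even with generous assumed throughput, the "each pool backs up the
other" scheme turns the backup window into a significant fraction of a
day, during which both pools are degraded.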


> > [..] it looks like hard drives are really at the end of their
> > advancement, as far as capacities per drive go.
> 
> The challenges are undeniable, but that's way too big a call.  Those
> are words you will regret in future; at least, I hope the future will
> be one in which those words are regrettable. :-)
> 

Honestly, from what I've seen and heard both here and on other forums,
the writing is on the wall, the fat lady has sung, and Mighty Casey has
struck out.  The 3.5" winchester hard drive is on terminal life support
for enterprise use. It will linger a little longer in commodity places,
where its cost/GB overcomes its weaknesses.  2.5" HDs will last out the
decade, as their slightly better performance/GB and space/power savings
will let them hold off solid-state media for a bit.  But solid-state is
the future, and it's a very near future.



> > >1TB drives currently have excessively long resilver time, inferior  
> > reliability (for the most part), and increased power consumption.
> 
> Yes, for the most part.  However, a 2TB drive has dramatically less
> power consumption than 2x1TB drives (and less of other valuable
> resources, like bays and controller slots). 
> 

GB/Watt, yes. Performance/watt, no. And you chew up additional
bays/slots/etc. trying to get back the performance with larger drives.
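A toy comparison makes the distinction concrete. The wattage and IOPS
figures below are assumed purely for illustration:

```python
# Assumed figures: one 2TB 5900RPM drive vs. two 1TB 7200RPM drives.
one_2tb = {"gb": 2000, "watts": 6.0, "iops": 60}
two_1tb = {"gb": 2000, "watts": 2 * 7.5, "iops": 2 * 80}

for name, d in (("1 x 2TB", one_2tb), ("2 x 1TB", two_1tb)):
    print(f"{name}: {d['gb'] / d['watts']:6.1f} GB/W, "
          f"{d['iops'] / d['watts']:5.2f} IOPS/W")
```

With numbers like these, the single big drive wins handily on GB/W
while the extra spindle wins on IOPS/W - and chasing the lost IOPS
with more big drives eats the bays and watts you saved.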


> > I'd generally recommend that folks NOT step beyond the 1TB capacity
> > at the 3.5" hard drive format.
> 
> A general recommendation is fine, and this is one I agree with for
> many scenarios.  At least, I'd recommend that folks look more closely
> at alternatives using 2.5" drives and sas expander bays than they
> might otherwise.
> 
> > So, while it's nice that you can indeed seamlessly swap up drive sizes  
> > (and your recommendation of using 2x7 helps that process), in reality,  
> > it's not a good idea to upgrade from his existing 1TB drives.
> 
> So what does he do instead, when he's running out of space and 1TB
> drives are hard to come by?   The advice still stands, as far as I'm
> concerned: do something now, that will leave you room for different
> expansion choices later - and evaluate the best expansion choice
> later, when the parameters of the time are known.
> --
> Dan.

I echo what Bob said earlier: don't plan on being able to upgrade these
disks in place.  Plan for expanding the setup, but don't expect to be
able to upgrade the 1TB disks to larger capacities - at least not for
the better part of this decade, until you can replace them with
solid-state drives of some sort.

As a practical matter, small setups are for the most part not
expandable/upgradable much, if at all. Buy what you need now, and plan
on rebuying something new in 5-10 years, but don't think that what you
put together now can be continuously upgraded for a decade. You make too
many tradeoffs in the initial design to allow for that kind of
upgradability.  Even enterprise stuff these days is pretty much "dispose
and replace", not "upgrade". 






-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
