Well obviously recovery scenarios need testing, but I still don't see it being 
that bad.  My thinking on this is:

1.  Loss of a server is very much the worst-case scenario.  Disk errors are 
much more likely, and with raid-z2 pools on the individual servers this should 
not pose a problem.  I also would not expect to see disk failures downing an 
entire x4500.  Sun have sold an awful lot of these now, enough for me to feel 
any such problems should be a thing of the past.

2.  Even when a server does fail, the nature of ZFS is such that you would not 
expect to lose your data, nor should you be expecting to resilver the entire 
28TB.  A motherboard / backplane / PSU failure will offline that server, but 
once the faulted components are replaced your pool will come back online.  Once 
the pool is online, ZFS can resilver just the data that changed while the 
server was offline, so your rebuild time should scale with the amount written 
during the outage, which is roughly proportional to the time the server was 
down.
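
To put rough numbers on that, here is a back-of-envelope sketch in Python.  
The function name and every rate in it are assumptions of mine for 
illustration, not measurements from an x4500, and it simply treats the whole 
28TB as allocated for the "full rebuild" comparison:

```python
def estimate_resilver_hours(downtime_hours, write_rate_mb_s,
                            resilver_rate_mb_s, pool_tb=28.0):
    """Rough estimate only; all inputs are hypothetical figures.

    Returns (incremental_hours, full_hours):
      incremental_hours - time to resilver only the data written
                          while the server was offline
      full_hours        - time to resilver the whole pool, for comparison
    """
    # Data written to the rest of the setup while this server was down.
    changed_mb = downtime_hours * 3600 * write_rate_mb_s

    # Whole pool in MB (decimal units, 1 TB = 1,000,000 MB).
    full_mb = pool_tb * 1_000_000

    # An incremental resilver can never need more than the full pool.
    changed_mb = min(changed_mb, full_mb)

    incremental_hours = changed_mb / resilver_rate_mb_s / 3600
    full_hours = full_mb / resilver_rate_mb_s / 3600
    return incremental_hours, full_hours


if __name__ == "__main__":
    # Assumed: server down for a day, 20 MB/s of sustained writes,
    # 200 MB/s resilver throughput.
    inc, full = estimate_resilver_hours(24, 20, 200)
    print(f"incremental: {inc:.1f} h, full: {full:.1f} h")
```

With those made-up figures a day of downtime costs a couple of hours of 
resilvering rather than a day and a half for the full pool, but the real 
rates would obviously have to come from testing.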

Of course these failure modes would need testing, as would rebuild times.  I 
don't see 'zfs send' performance being an issue though, not unless Grey has 
another 150TB of storage lying around that he's not telling us about.  :-)

There are always going to be some tradeoffs between risk, capacity and price, 
but I expect that the benefits of this setup far outweigh the negatives.

Ross
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
