Re: [zfs-discuss] Raidz vdev size... again.

2009-04-29 Thread Bob Friesenhahn
On Tue, 28 Apr 2009, Richard Elling wrote: I suppose if you could freeze the media to 0K, then it would not decay. But that isn't the world I live in :-). There is a whole Journal devoted to things magnetic, with lots of studies of interesting compounds. But from a practical perspective, it is

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Richard Elling
Bob Friesenhahn wrote: On Tue, 28 Apr 2009, Richard Elling wrote: Yes and there is a very important point here. There are 2 different sorts of scrubbing: read and rewrite. ZFS (today) does read scrubbing, which does not reset the decay process. Some RAID arrays also do rewrite scrubs which does

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Tim
On Tue, Apr 28, 2009 at 11:12 PM, Bob Friesenhahn < bfrie...@simple.dallas.tx.us> wrote: > On Tue, 28 Apr 2009, Tim wrote: > > I'll stick with the 3 year life cycle of the system followed by a hot >> migration to new storage, thank you very much. >> > > Once again there is a fixation on the idea

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Bob Friesenhahn
On Tue, 28 Apr 2009, Tim wrote: I'll stick with the 3 year life cycle of the system followed by a hot migration to new storage, thank you very much. Once again there is a fixation on the idea that computers gradually degrade over time and that simply replacing the hardware before the expirat

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Bob Friesenhahn
On Tue, 28 Apr 2009, Richard Elling wrote: Yes and there is a very important point here. There are 2 different sorts of scrubbing: read and rewrite. ZFS (today) does read scrubbing, which does not reset the decay process. Some RAID arrays also do rewrite scrubs, which do reset the decay process
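
A minimal sketch of the distinction being drawn between the two kinds of scrub: a read scrub only verifies blocks against their checksums, while a rewrite scrub also writes each good block back so the recorded signal is refreshed. This is an illustrative model only; the function names and the "decay" bookkeeping are assumptions, not ZFS internals:

    # Illustrative model of read vs. rewrite scrubbing (not ZFS internals).
    import hashlib

    def read_scrub(blocks, checksums):
        """Verify every block against its stored checksum; data is only read,
        so any decay clock on the medium keeps running."""
        bad = [i for i, b in enumerate(blocks)
               if hashlib.sha256(b).hexdigest() != checksums[i]]
        return bad  # indices that need repair from redundancy

    def rewrite_scrub(blocks, checksums, device_write):
        """Verify and then rewrite every good block in place, which refreshes
        the recorded signal and (conceptually) resets the decay process."""
        bad = []
        for i, b in enumerate(blocks):
            if hashlib.sha256(b).hexdigest() == checksums[i]:
                device_write(i, b)   # freshen the magnetization
            else:
                bad.append(i)        # still needs repair from parity/mirror
        return bad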

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Bob Friesenhahn
On Tue, 28 Apr 2009, Miles Nordin wrote: * it'd be harmful to do this on SSD's. it might also be a really good idea to do it on SSD's. who knows yet. SSDs can be based on many types of technologies, and not just those that wear out. * it may be wasteful to do read/rewrite on an ordin

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Tim
On Tue, Apr 28, 2009 at 4:52 PM, Richard Elling wrote: > > Well done! Of course Hitachi doesn't use consumer-grade disks in > their arrays... > > I'll also confess that I did set a bit of a math trap here :-) The trap is > that if you ever have to recover data from tape/backup, then you'll > hav
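
The "math trap" can be made concrete with the steady-state availability formula A = MTTF / (MTTF + MTTR): a rare failure with a long repair time, such as a restore from tape, dominates the downtime budget. A quick sketch, with illustrative figures that are assumptions rather than numbers from the thread:

    # Steady-state availability: A = MTTF / (MTTF + MTTR).
    # A single 12-hour tape restore every few years already costs far more
    # downtime than a five-nines budget allows. Figures are assumptions.
    def availability(mttf_hours, mttr_hours):
        return mttf_hours / (mttf_hours + mttr_hours)

    a = availability(mttf_hours=5 * 8766, mttr_hours=12)  # ~one restore per 5 years
    print(f"availability ~ {a:.6f}")                       # ~0.999726, three nines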

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread David Magda
On Apr 28, 2009, at 18:02, Richard Elling wrote: Kees Nuyt wrote: Some high availability storage systems overcome this decay by not just reading, but also writing all blocks during a scrub. In those systems, scrubbing is done semi-continuously in the background, not on user/admin demand.

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Richard Elling
Kees Nuyt wrote: On Mon, 27 Apr 2009 18:25:42 -0700, Richard Elling wrote: The concern with large drives is unrecoverable reads during resilvering. One contributor to this is superparamagnetic decay, where the bits are lost over time as the medium tries to revert to a more steady state. To
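
A rough back-of-envelope sketch of why unrecoverable reads during resilvering are the concern, assuming the commonly quoted consumer-drive spec of one unrecoverable error per 10^14 bits read; the sizes and rate below are illustrative assumptions, not figures from this thread:

    # Probability of completing a resilver without an unrecoverable read,
    # assuming independent bit errors at the quoted spec rate.
    import math

    BER = 1e-14   # unrecoverable errors per bit read (consumer-class spec)

    def p_clean_resilver(bytes_to_read):
        bits = bytes_to_read * 8
        return math.exp(bits * math.log1p(-BER))

    for tb in (1, 5, 10):
        p = p_clean_resilver(tb * 1e12)
        print(f"{tb:>2} TB read -> P(no URE) ~ {p:.3f}")
    # ~0.92 for 1 TB, ~0.67 for 5 TB, ~0.45 for 10 TB with these assumptions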

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Richard Elling
Tim wrote: On Mon, Apr 27, 2009 at 8:25 PM, Richard Elling <richard.ell...@gmail.com> wrote: I do not believe you can achieve five 9s with current consumer disk drives for an extended period, say >1 year. Just to pipe up, while very few vendors can pull it off, we've seen

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Miles Nordin
> "kn" == Kees Nuyt writes: kn> Some high availablility storage systems overcome this decay by kn> not just reading, but also writing all blocks during a kn> scrub. sounds like a good idea but harder in the ZFS model where the software isn't the proprietary work of the only perm

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Kees Nuyt
On Mon, 27 Apr 2009 18:25:42 -0700, Richard Elling wrote: >The concern with large drives is unrecoverable reads during resilvering. >One contributor to this is superparamagnetic decay, where the bits are >lost over time as the medium tries to revert to a more steady state. >To some extent, period

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Fajar A. Nugraha
On Tue, Apr 28, 2009 at 9:42 AM, Scott Lawson wrote: >> Mainstream Solaris 10 gets a port of ZFS from OpenSolaris, so its >> features are fewer and later.  As time ticks away, fewer features >> will be back-ported to Solaris 10.  Meanwhile, you can get a production >> support  agreement for OpenSo

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Blake
On Tue, Apr 28, 2009 at 10:08 AM, Tim wrote: > > > On Mon, Apr 27, 2009 at 8:25 PM, Richard Elling > wrote: >> >> I do not believe you can achieve five 9s with current consumer disk >> drives for an extended period, say >1 year. > > Just to pipe up, while very few vendors can pull it off, we've s

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-28 Thread Tim
On Mon, Apr 27, 2009 at 8:25 PM, Richard Elling wrote: > > I do not believe you can achieve five 9s with current consumer disk > drives for an extended period, say >1 year. > Just to pipe up, while very few vendors can pull it off, we've seen five 9's with Hitachi gear using SATA. --Tim
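
For context on the claim, "five 9s" means 99.999% availability, which leaves only a few minutes of downtime per year. A quick conversion, assuming availability is measured over a calendar year:

    # Downtime budget implied by an availability target of N nines.
    def downtime_minutes_per_year(nines):
        unavailability = 10 ** (-nines)
        return unavailability * 365.25 * 24 * 60

    for n in (3, 4, 5):
        print(f"{n} nines -> {downtime_minutes_per_year(n):.1f} minutes/year")
    # 3 nines -> ~526 min, 4 nines -> ~53 min, 5 nines -> ~5.3 min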

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Scott Lawson
Richard Elling wrote: Some history below... Scott Lawson wrote: Michael Shadle wrote: On Mon, Apr 27, 2009 at 4:51 PM, Scott Lawson wrote: If possible though you would be best to let the 3ware controller expose the 16 disks as a JBOD to ZFS and create a RAIDZ2 within Solaris as you

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Bob Friesenhahn
On Mon, 27 Apr 2009, Michael Shadle wrote: I was still operating under the impression that vdevs larger than 7-8 disks typically make baby Jesus nervous. Baby Jesus might not be particularly nervous but if your drives don't perform consistently, then there will be more chance of performance
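
One rule of thumb behind the 7-8 disk guidance: for small random reads, each raidz/raidz2 vdev delivers roughly the IOPS of a single member disk, so splitting the same disks into more vdevs multiplies random throughput at the cost of extra parity overhead. A simplified sketch under that assumption; the per-disk IOPS figure is illustrative only:

    # Simplified raidz sizing comparison: random IOPS scale with vdev count,
    # usable capacity scales with (disks - parity) per vdev.
    DISK_IOPS = 80      # illustrative 7200 rpm SATA figure
    DISK_TB = 1.0

    def pool(vdevs, disks_per_vdev, parity=2):
        iops = vdevs * DISK_IOPS                        # ~one disk of random IOPS per vdev
        usable = vdevs * (disks_per_vdev - parity) * DISK_TB
        return iops, usable

    for vdevs, width in [(1, 16), (2, 8)]:
        iops, usable = pool(vdevs, width)
        print(f"{vdevs} x {width}-disk raidz2: ~{iops} random IOPS, {usable:.0f} TB usable")
    # 1 x 16-disk raidz2: ~80 IOPS, 14 TB; 2 x 8-disk raidz2: ~160 IOPS, 12 TB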

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Richard Elling
Some history below... Scott Lawson wrote: Michael Shadle wrote: On Mon, Apr 27, 2009 at 4:51 PM, Scott Lawson wrote: If possible though you would be best to let the 3ware controller expose the 16 disks as a JBOD to ZFS and create a RAIDZ2 within Solaris as you will then gain the full b

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Scott Lawson
Michael Shadle wrote: On Mon, Apr 27, 2009 at 5:32 PM, Scott Lawson wrote: One thing you haven't mentioned is the drive type and size that you are planning to use as this greatly influences what people here would recommend. RAIDZ2 is built for big, slow SATA disks as reconstruction times

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Michael Shadle
On Mon, Apr 27, 2009 at 5:32 PM, Scott Lawson wrote: > One thing you haven't mentioned is the drive type and size that you are > planning to use as this > greatly influences what people here would recommend. RAIDZ2 is built for > big, slow SATA > disks as reconstruction times in large RAIDZ's and
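
A crude lower bound on the reconstruction times being referred to: a resilver has to rewrite at least the used space on the replacement disk, so even at full sequential speed a big SATA drive takes hours, and real resilvers on busy pools take much longer. A sketch with assumed drive figures:

    # Lower bound on resilver time: capacity to rewrite / sustained write rate.
    def resilver_hours(capacity_tb, mb_per_sec):
        return capacity_tb * 1e6 / mb_per_sec / 3600

    for tb in (0.5, 1.0, 2.0):
        print(f"{tb} TB at 70 MB/s -> at least {resilver_hours(tb, 70):.1f} hours")
    # ~2.0 h, ~4.0 h, ~7.9 h respectively (best case, idle pool)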

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Scott Lawson
Michael Shadle wrote: On Mon, Apr 27, 2009 at 4:51 PM, Scott Lawson wrote: If possible though you would be best to let the 3ware controller expose the 16 disks as a JBOD to ZFS and create a RAIDZ2 within Solaris as you will then gain the full benefits of ZFS. Block self healing etc etc.

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Michael Shadle
On Mon, Apr 27, 2009 at 4:51 PM, Scott Lawson wrote: > If possible though you would be best to let the 3ware controller expose > the 16 disks as a JBOD  to ZFS and create a RAIDZ2 within Solaris as you > will then > gain the full benefits of ZFS. Block self healing etc etc. > > There isn't an iss

Re: [zfs-discuss] Raidz vdev size... again.

2009-04-27 Thread Scott Lawson
Leon, RAIDZ2 is roughly equivalent to RAID6: 2 disks of parity data, allowing a double drive failure while still keeping the pool available. If possible though you would be best to let the 3ware controller expose the 16 disks as a JBOD to ZFS and create a RAIDZ2 within Solaris as you will then gain
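
A rough sketch of why the second parity disk matters in a wide vdev: it covers the chance that another drive fails while the first rebuild is still running. The annualized failure rate and rebuild window below are illustrative assumptions, not figures from the thread:

    # Chance that a second disk (out of the survivors) fails during the
    # rebuild window, assuming independent exponential failures.
    import math

    AFR = 0.03              # illustrative 3% annualized failure rate per disk
    survivors = 15          # remaining disks in a 16-disk vdev
    rebuild_hours = 24      # illustrative resilver window

    rate_per_hour = AFR / 8766
    p_one = 1 - math.exp(-rate_per_hour * rebuild_hours)   # one given disk fails
    p_any = 1 - (1 - p_one) ** survivors                    # any survivor fails
    print(f"P(second failure during rebuild) ~ {p_any:.4%}")  # ~0.12% with these numbers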