Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-14 Thread Paul Armstrong
> Paul Kraus wrote:
>> In the ZFS case I could replace the disk and the zpool would
>> resilver automatically. I could also take the removed disk and put it
>> into the second system and have it recognize the zpool (and that it
>> was missing half of a mirror) and the data was all
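The workflow Paul describes maps onto a short command sequence. This is an illustrative sketch only: the pool name `tank` and device name `c1t2d0` are placeholders, and exact behavior varies by Solaris/ZFS release.

```
# On the original host: swap the failed disk, then tell ZFS about it.
# ZFS resilvers onto the new device automatically once the replace is issued.
zpool status tank            # identify the FAULTED device
zpool replace tank c1t2d0    # new disk in the same slot: replace in place
zpool status tank            # watch resilver progress

# On a second host: import the pool from the removed mirror half.
# The pool should come up DEGRADED (one side of the mirror missing)
# with the data still readable.
zpool import                 # scan attached devices for importable pools
zpool import tank
zpool status tank            # expect state: DEGRADED
```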

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Marion Hakanson
> . . .
>> Use JBODs. Or tell the cache controllers to ignore
>> the flushing requests.

[EMAIL PROTECTED] said:
> Unfortunately HP EVA can't do it. About the 9900V, it is really fast (64GB
> cache helps a lot) and reliable. 100% uptime in years. We'll never touch it
> to solve a ZFS problem. On o
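When the array itself can't be told to ignore SYNCHRONIZE CACHE requests (as with the HP EVA here), the usual host-side workaround on Solaris is the zfs_nocacheflush tunable. A config sketch; this is only safe when every device under ZFS sits behind battery-backed nonvolatile cache, since it disables the flushes ZFS relies on for durability:

```
* /etc/system -- tell ZFS not to issue cache-flush requests to devices.
* Only appropriate when all vdevs have nonvolatile (battery-backed) cache.
set zfs:zfs_nocacheflush = 1
```

A reboot is required for /etc/system changes to take effect.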

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Gino
> It seems that maybe there is too large a code path leading to panics --
> maybe a side effect of ZFS being "new" (compared to other filesystems). I
> would hope that as these panic issues are coming up that the code path
> leading to the panic is evaluated for a specific fix or behavior

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Wade . Stuart
[EMAIL PROTECTED] wrote on 09/12/2007 08:04:33 AM:
>> Gino wrote:
>>> The real problem is that ZFS should stop to force kernel panics.
>>
>> I found these panics very annoying, too. And even more that the zpool
>> was faulted afterwards. But my problem is that when someone ask

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Gino
> Gino wrote:
>> The real problem is that ZFS should stop to force kernel panics.
>
> I found these panics very annoying, too. And even more that the zpool
> was faulted afterwards. But my problem is that when someone asks me what
> ZFS should do instead, I have no idea.

well, what a

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Ralf Ramge
Gino wrote:
> The real problem is that ZFS should stop to force kernel panics.

I found these panics very annoying, too. And even more that the zpool was
faulted afterwards. But my problem is that when someone asks me what ZFS
should do instead, I have no idea.

>> I have large Sybase datab

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Gino
>> -We had tons of kernel panics because of ZFS.
>> Here a "reboot" must be planned with a couple of weeks in advance
>> and done only at saturday night ..
>
> Well, I'm sorry, but if your datacenter runs into problems when a single
> server isn't available, you probably have much wors

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Gino
> We have seen just the opposite... we have a server with about
> 40 million files and only 4 TB of data. We have been benchmarking FSes
> for creation and manipulation of large populations of small files and
> ZFS is the only one we have found that continues to scale linearly
> above one m

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Gino
> Yes, this is a case where the disk has not completely failed.
> ZFS seems to handle the completely failed disk case properly, and
> has for a long time. Cutting the power (which you can also do with
> luxadm) makes the disk appear completely failed.

Richard, I think you're right. The fail

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Gino
> On Tue, 2007-09-11 at 13:43 -0700, Gino wrote:
>> -ZFS+FC JBOD: failed hard disk need a reboot :(
>> (frankly unbelievable in 2007!)
>
> So, I've been using ZFS with some creaky old FC JBODs (A5200's) and old
> disks which have been failing regularly and haven't seen that; the w

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-12 Thread Ralf Ramge
Gino wrote:
[...]
> Just a few examples:
> -We lost several zpool with S10U3 because of "spacemap" bug,
> and -nothing- was recoverable. No fsck here :(

Yes, I criticized the lack of zpool recovery mechanisms, too, during my
AVS testing. But I don't have the know-how to judge if it has t

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-11 Thread Paul Kraus
On 9/11/07, Gino <[EMAIL PROTECTED]> wrote:
> -ZFS performs badly with a lot of small files.
> (about 20 times slower than UFS with our million-file rsync procedures)

We have seen just the opposite... we have a server with about 40 million
files and only 4 TB of data. We have been benchm
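A small-file benchmark like the one Paul describes is easy to sketch. The harness below is hypothetical (not the poster's actual methodology): it times the creation of many small files spread across subdirectories, roughly the shape of an rsync tree.

```python
import os
import tempfile
import time


def create_small_files(root, count, size=1024):
    """Create `count` files of `size` bytes under `root`; return elapsed seconds."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        # Spread files across subdirectories, as large rsync trees usually are.
        sub = os.path.join(root, str(i % 100))
        os.makedirs(sub, exist_ok=True)
        with open(os.path.join(sub, "f%d" % i), "wb") as f:
            f.write(payload)
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        elapsed = create_small_files(root, 10_000)
        print("created 10000 files in %.2fs (%.0f files/s)"
              % (elapsed, 10_000 / elapsed))
```

Run it once per filesystem under test (pointing `root` at a directory on that filesystem) and compare the files/s figures; creation rate staying flat as the population grows is the "scales linearly" behavior described above.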

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-11 Thread Richard Elling
Bill Sommerfeld wrote:
> On Tue, 2007-09-11 at 13:43 -0700, Gino wrote:
>> -ZFS+FC JBOD: failed hard disk need a reboot :(
>> (frankly unbelievable in 2007!)
>
> So, I've been using ZFS with some creaky old FC JBODs (A5200's) and old
> disks which have been failing regularly and haven't s

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-11 Thread Bill Sommerfeld
On Tue, 2007-09-11 at 13:43 -0700, Gino wrote:
> -ZFS+FC JBOD: failed hard disk need a reboot :(
> (frankly unbelievable in 2007!)

So, I've been using ZFS with some creaky old FC JBODs (A5200's) and old
disks which have been failing regularly and haven't seen that; the worst
I've seen run

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-11 Thread Gino
> To put this in perspective, no system on the planet today handles all
> faults. I would even argue that building such a system is theoretically
> impossible.

no doubt about that ;)

> So the subset of faults which ZFS covers which is different than the
> subset that UFS covers and different

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-10 Thread Richard Elling
Gino wrote:
>>> Richard, thank you for your detailed reply.
>>> Unfortunately another reason to stay with UFS in production ..
>>
>> IMHO, maturity is the primary reason to stick with UFS. To look at
>> this through the maturity lens, UFS is the great grandfather living on
>> life su

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-09 Thread Gino
>> Richard, thank you for your detailed reply.
>> Unfortunately another reason to stay with UFS in production ..
>
> IMHO, maturity is the primary reason to stick with UFS. To look at
> this through the maturity lens, UFS is the great grandfather living on
> life support (p

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-08 Thread Richard Elling
Gino wrote:
>> "cfgadm -al" or "devfsadm -C" didn't solve the problem.
>> After a reboot ZFS recognized the drive as failed and all worked well.
>> Do we need to restart Solaris after a drive failure??
>
> It depends...

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-08 Thread Gino
>> "cfgadm -al" or "devfsadm -C" didn't solve the problem.
>> After a reboot ZFS recognized the drive as failed and all worked well.
>>
>> Do we need to restart Solaris after a drive failure??
>
> It depends...
> ... on which version of Solaris you are running. ZFS FMA phase 2 wa

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-05 Thread Richard Elling
Paul Kraus wrote:
> On 9/4/07, Gino <[EMAIL PROTECTED]> wrote:
>> yesterday we had a drive failure on a fc-al jbod with 14 drives.
>> Suddenly the zpool using that jbod stopped responding to I/O requests
>> and we got tons of the following messages in /var/adm/messages:
>
>> "cfgadm -al"

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-05 Thread Paul Kraus
On 9/4/07, Gino <[EMAIL PROTECTED]> wrote:
> yesterday we had a drive failure on a fc-al jbod with 14 drives.
> Suddenly the zpool using that jbod stopped responding to I/O requests
> and we got tons of the following messages in /var/adm/messages:

> "cfgadm -al" or "devfsadm -C" didn't solve th

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-04 Thread Gino
Hi Mark,
the drive (147GB, FC 2Gb) failed on a Xyratex JBOD. Also in the past we
had the same problem with a drive that failed on an EMC CX JBOD.
Anyway, I can't understand why rebooting Solaris resolved the situation ..
Thank you,
Gino

Re: [zfs-discuss] I/O freeze after a disk failure

2007-09-04 Thread Mark Ashley
I'm going to go out on a limb here and say you have an A5000 with the 1.6" disks in it. Because of their design (all drives seeing each other on both the A and B loops), it's possible for one disk that is behaving badly to take over the FC-AL loop and require human intervention. You can physic

[zfs-discuss] I/O freeze after a disk failure

2007-09-04 Thread Gino
Hi all,
yesterday we had a drive failure on a fc-al jbod with 14 drives.
Suddenly the zpool using that jbod stopped responding to I/O requests and
we got tons of the following messages in /var/adm/messages:

Sep 3 15:20:10 fb2 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/[EMAIL PROTECTED]