Re: [zfs-discuss] Narrow escape with FAULTED disks

2010-08-23 Thread Mark Bennett
Well, I do have a plan. Thanks to the portability of ZFS boot disks, I'll make two new OS disks on another machine with the next Nexenta release, export the data pool and swap in the new ones. That way, I can at least manage a zfs scrub without killing the performance and get the Intel SSDs I

Re: [zfs-discuss] Narrow escape with FAULTED disks

2010-08-18 Thread Cindy Swearingen
It's hard to tell what caused the SMART predictive failure message; it could be something like a temperature fluctuation. If ZFS noticed that a disk wasn't available yet, then I would expect a message to that effect. In any case, I think I would have a replacement disk available. The important thing is that you continue to m

Re: [zfs-discuss] Narrow escape with FAULTED disks

2010-08-17 Thread Cindy Swearingen
Hi Mark, I would recheck with fmdump to see if you have any persistent errors on the second disk. The fmdump command will display faults and fmdump -eV will display errors (persistent faults that have turned into errors based on some criteria). If fmdump -eV doesn't show any activity for that

[zfs-discuss] Narrow escape with FAULTED disks

2010-08-16 Thread Mark Bennett
Nothing like a "heart in mouth moment" to shave years from your life. I rebooted a snv_132 box in perfect health, and it came back up with two FAULTED disks in the same vdisk group. Everything I found in an hour on Google basically said "your data is gone". All 45TB of it. A postmortem of fmadm sh

Re: [zfs-discuss] Narrow escape!

2009-06-24 Thread Ross
Ok, this is getting weird. I just ran a zpool clear, and now it says:
# zpool clear zfspool
# zpool status
  pool: zfspool
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable.
action: Upgrade the pool u

Re: [zfs-discuss] Narrow escape!

2009-06-24 Thread Ross
Thanks Mark, it looks like that was good advice. It also appears that as suggested, it's not the drive that's faulty... anybody have any thoughts as to how I find what's actually the problem?
# zpool status
  pool: zfspool
 state: DEGRADED
status: One or more devices has experienced an unrecove

Re: [zfs-discuss] Narrow escape!

2009-06-23 Thread Haudy Kazemi
"scrub: resilver completed after 5h50m with 0 errors on Tue Jun 23 05:04:18 2009" Zero errors even though other parts of the message definitely show errors? This is described here: http://docs.sun.com/app/docs/doc/819-5461/gbcve?a=view Device errors do not guarantee pool errors when redundancy

Re: [zfs-discuss] Narrow escape!

2009-06-23 Thread Mark J Musante
On Mon, 22 Jun 2009, Ross wrote:
> All seemed well, I replaced the faulty drive, imported the pool again, and kicked off the repair with:
> # zpool replace zfspool c1t1d0
What build are you running? Between builds 105 and 113 inclusive there's a bug in the resilver code which causes it to miss

Re: [zfs-discuss] Narrow escape!

2009-06-23 Thread Fajar A. Nugraha
On Tue, Jun 23, 2009 at 1:13 PM, Ross wrote:
> Look at how the resilver finished:
>
>            c1t3d0  ONLINE       3     0     0  128K resilvered
>            c1t4d0  ONLINE       0     0    11  473K resilvered
>            c1t5d0  ONLINE       0     0    23  986K resilvered
Comparing from your

Re: [zfs-discuss] Narrow escape!

2009-06-22 Thread Ross
To be honest, never. It's a cheap server sat at home, and I never got around to writing a script to scrub it and report errors. I'm going to write one now though! Look at how the resilver finished:
# zpool status
  pool: zfspool
 state: ONLINE
status: One or more devices has experienced an unr
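The error-reporting half of the script Ross mentions could be sketched roughly as below. This is a minimal sketch, not anything posted in the thread: the function name `parse_device_errors` is hypothetical, and it assumes device rows in `zpool status` output have the form NAME STATE READ WRITE CKSUM with plain integer counters, as in the rows quoted in this thread. It only parses text; starting the scrub and capturing the `zpool status` output is left to the caller.

```python
import re

# Matches device rows such as "c1t4d0  ONLINE  0  0  11  473K resilvered".
# Assumption: columns are NAME STATE READ WRITE CKSUM with integer counters.
_ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\b")


def parse_device_errors(status_text):
    """Return {device: (read, write, cksum)} for rows with a nonzero counter."""
    bad = {}
    for line in status_text.splitlines():
        m = _ROW.match(line)
        if not m:
            continue  # header lines, pool/state/status lines, blank lines
        name, _state, read, write, cksum = m.groups()
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            bad[name] = counts
    return bad


# Sample rows copied from the resilver output quoted in this thread.
sample = """\
            c1t3d0  ONLINE       3     0     0  128K resilvered
            c1t4d0  ONLINE       0     0    11  473K resilvered
            c1t5d0  ONLINE       0     0    23  986K resilvered
"""

for dev, counts in parse_device_errors(sample).items():
    print(dev, counts)
```

A cron job could feed the captured status text to this function and mail the result only when the returned dictionary is non-empty. Rows whose counters are not plain integers would be skipped by this pattern, so treat it as a starting point.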

Re: [zfs-discuss] Narrow escape!

2009-06-22 Thread Bob Friesenhahn
On Mon, 22 Jun 2009, Ed Spencer wrote:
> I'm curious, how often do you scrub the pool?
Once a week for me. Early every Monday morning so that if something goes wrong, it is at the start of the week. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfrie

Re: [zfs-discuss] Narrow escape!

2009-06-22 Thread Ed Spencer
I'm curious, how often do you scrub the pool?
On Mon, 2009-06-22 at 15:33, Ross wrote:
> Hey folks,
>
> Well, I've had a disk fail in my home server, so I've had my first experience
> of hunting down the faulty drive and replacing it (damn sight easier on Sun
> kit than on a home built box I can

Re: [zfs-discuss] Narrow escape!

2009-06-22 Thread Simon Breden
Lucky one there Ross! Makes me glad I also upgraded to RAID-Z2 ;-) Simon

[zfs-discuss] Narrow escape!

2009-06-22 Thread Ross
Hey folks, Well, I've had a disk fail in my home server, so I've had my first experience of hunting down the faulty drive and replacing it (damn sight easier on Sun kit than on a home built box I can tell you!). All seemed well, I replaced the faulty drive, imported the pool again, and kicked o