Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-30 Thread Bob Friesenhahn
On Mon, 30 Nov 2009, Carsten Aulbert wrote: after the disk was exchanged, I ran 'zpool clear' and another 'zpool scrub' afterwards... and guess what, now another vdev shows similar problems: Ugh! Now, the big question is, what could be faulty. fmadm only shows vdev checksum problems, right now

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-30 Thread Carsten Aulbert
Hi all, after the disk was exchanged, I ran 'zpool clear' and another 'zpool scrub' afterwards... and guess what, now another vdev shows similar problems: s13:~# zpool status pool: atlashome state: DEGRADED

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-27 Thread Carsten Aulbert
Hi Ross, On Friday 27 November 2009 21:31:52 Ross Walker wrote: > I would plan downtime to physically inspect the cabling. There is not much cabling, as the disks are directly connected to a large backplane (Sun Fire X4500). Cheers, Carsten

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-27 Thread Ross Walker
On Nov 27, 2009, at 12:55 PM, Carsten Aulbert wrote: On Friday 27 November 2009 18:45:36 Carsten Aulbert wrote: I was too fast, now it looks completely different: scrub: resilver completed after 4h3m with 0 errors on Fri Nov 27 18:46:33 2009 [...] s13:~# zpool status pool: atlashome state

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-27 Thread Bob Friesenhahn
On Fri, 27 Nov 2009, Carsten Aulbert wrote: Now the big question: (1) zpool clear or (2) bring in the spare again (or exchange two more disks)? Opinions? Since "applications are unaffected" (good sign!), I would save all notes regarding current status, do 'zpool clear', 'zpool scrub' and t

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-27 Thread Carsten Aulbert
On Friday 27 November 2009 18:45:36 Carsten Aulbert wrote: I was too fast, now it looks completely different: scrub: resilver completed after 4h3m with 0 errors on Fri Nov 27 18:46:33 2009 [...] s13:~# zpool status pool: atlashome state: DEGRADED status: One or

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-27 Thread Carsten Aulbert
Hi Bob, On Friday 27 November 2009 17:19:22 Bob Friesenhahn wrote: > > It is interesting that in addition to being in the same vdev, the > disks encountering serious problems are all target 6. Besides > something at the zfs level, there could be some issue at the > device driver, or underlyi

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-27 Thread Bob Friesenhahn
On Fri, 27 Nov 2009, Carsten Aulbert wrote: At the very least, I would consider physically replacing c1t6d0. That's an option and see if I can let the system repair more of the errors. Regarding the error with a named disk, there is only one disk named in the output so far. Definitely repla

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-27 Thread Carsten Aulbert
Hi all, On Thursday 26 November 2009 17:38:42 Cindy Swearingen wrote: > Did anything about this configuration change before the checksum errors > occurred? > No, this machine has been running in this configuration for a couple of weeks now. > The errors on c1t6d0 are severe enough that your spare kick

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-26 Thread Cindy Swearingen
> Hi all, > > on a x4500 with a relatively well patched Sol10u8 > > # uname -a > SunOS s13 5.10 Generic_141445-09 i86pc i386 i86pc > > I've started a scrub after about 2 weeks of operation > and have a lot of > checksum errors: > > s13:~# zpool status >

Re: [zfs-discuss] Help needed to find out where the problem is

2009-11-26 Thread Richard Elling
On Nov 26, 2009, at 2:35 AM, Carsten Aulbert wrote: Hi all, on a x4500 with a relatively well patched Sol10u8 # uname -a SunOS s13 5.10 Generic_141445-09 i86pc i386 i86pc I've started a scrub after about 2 weeks of operation and have a lot of checksum errors: s13:~# zpool status pool: a

[zfs-discuss] Help needed to find out where the problem is

2009-11-26 Thread Carsten Aulbert
Hi all, on a x4500 with a relatively well patched Sol10u8 # uname -a SunOS s13 5.10 Generic_141445-09 i86pc i386 i86pc I've started a scrub after about 2 weeks of operation and have a lot of checksum errors: s13:~# zpool status pool: atlashome