Re: [zfs-discuss] zpool scrub on b123

2011-05-12 Thread Richard Elling
On May 12, 2011, at 1:53 PM, Karl Rossing wrote:
> I have an outage tonight and would like to swap out the LSI 3801 for an LSI 9200
>
> Should I zpool export before swapping the card?

A clean shutdown is sufficient. You might need to "devfsadm -c disk" to build the device tree. -- richa
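
A minimal sketch of the sequence Richard describes, assuming the pool is the vdipool named elsewhere in the thread; exact device names after the new HBA comes up may differ:

    # shut down cleanly before pulling the LSI 3801
    pfexec /usr/sbin/shutdown -y -i5 -g0

    # after the 9200 is installed and the box is booted, rebuild the
    # /dev disk links if the new targets are not visible yet
    pfexec /usr/sbin/devfsadm -c disk

    # confirm the pool and its devices came back
    pfexec /usr/sbin/zpool status vdipool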

Re: [zfs-discuss] zpool scrub on b123

2011-05-12 Thread Karl Rossing
I have an outage tonight and would like to swap out the LSI 3801 for an LSI 9200.

Should I zpool export before swapping the card?

On 04/16/2011 10:45 AM, Roy Sigurd Karlsbakk wrote:
> I'm going to wait until the scrub is complete before diving in some more. I'm wondering if replacing the LSI
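
For reference, a sketch of the export-first approach being asked about (the reply elsewhere in the thread notes a clean shutdown is enough); the pool name is assumed from the rest of the thread:

    # optional: export before the hardware swap
    pfexec /usr/sbin/zpool export vdipool

    # ...swap the HBA and boot...

    # re-import on the new controller; -d can point at a device
    # directory if the default search does not find the disks
    pfexec /usr/sbin/zpool import vdipool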

Re: [zfs-discuss] zpool scrub on b123

2011-04-18 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Karl Rossing
>
> So I figured out after a couple of scrubs and fmadm faulty that drive c9t15d0 was bad.
>
> My pool now looks like this:
> NAME STATE READ WRITE CKSUM

Re: [zfs-discuss] zpool scrub on b123

2011-04-18 Thread Roy Sigurd Karlsbakk
> I'm going to replace c9t15d0 with a new drive.
>
> I find it odd that zfs needed to resilver the drive after the reboot.
> Shouldn't the resilvered information be kept across reboots?

the iostat data, as returned from iostat -en, are not kept over a reboot. I don't know if it's possible to kee
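
The per-device error counters Roy is referring to can be checked like this (a sketch; the counters live in the kernel and reset at boot, which is the point being made):

    # soft/hard/transport error summary per device
    iostat -en

    # verbose error detail for one disk, including vendor and serial
    iostat -En c9t15d0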

Re: [zfs-discuss] zpool scrub on b123

2011-04-18 Thread Karl Rossing
So I figured out after a couple of scrubs and fmadm faulty that drive c9t15d0 was bad. I then replaced the drive using

-bash-3.2$ pfexec /usr/sbin/zpool offline vdipool c9t15d0
-bash-3.2$ pfexec /usr/sbin/zpool replace vdipool c9t15d0 c9t19d0

The drive resilvered and I rebooted the server, j
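
A sketch of verifying the replace before rebooting, assuming the same pool and devices; once the resilver completes, the replaced disk should drop out of the configuration on its own:

    # watch resilver progress for the c9t15d0 -> c9t19d0 replace
    pfexec /usr/sbin/zpool status vdipool

    # after "resilver completed" is reported, clear stale error counts
    pfexec /usr/sbin/zpool clear vdipool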

Re: [zfs-discuss] zpool scrub on b123

2011-04-16 Thread Roy Sigurd Karlsbakk
> I'm going to wait until the scrub is complete before diving in some more.
>
> I'm wondering if replacing the LSI SAS 3801E with an LSI SAS 9200-8e might help too.

I've seen similar errors with 3801 - seems to be SAS timeouts. Reboot the box and it'll probably work well again for a while. I
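
One way to check for the SAS timeouts Roy mentions is the system log (a sketch; exact wording varies by HBA driver, mpt in the case of the 3801):

    # look for timeout/retry noise from the HBA driver
    grep -i timeout /var/adm/messages | grep -i mpt

    # or watch new messages while the pool is under load
    tail -f /var/adm/messages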

Re: [zfs-discuss] zpool scrub on b123

2011-04-15 Thread Nathan Kroenert
Hi Karl,

Is there any chance at all that some other system is writing to the drives in this pool? You say other things are writing to the same JBOD... Given that the amount flagged as corrupt is so small, I'd imagine not, but thought I'd ask the question anyways.

Cheers!
Nathan.

On 04/16

Re: [zfs-discuss] zpool scrub on b123

2011-04-15 Thread Cindy Swearingen
Yes, the Solaris 10 9/10 release has the fix for RAIDZ checksum errors if you have ruled out any hardware problems.

cs

On 04/15/11 14:47, Karl Rossing wrote:
> Would moving the pool to a Solaris 10U9 server fix the random RAIDZ errors?
>
> On 04/15/2011 02:23 PM, Cindy Swearingen wrote:
>> D'oh. One mo
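
Moving the pool to the Solaris 10 9/10 host would look roughly like this (a sketch; worth confirming first that the pool version is one the target release supports):

    # on the current host
    pfexec /usr/sbin/zpool export vdipool

    # on the Solaris 10 9/10 host
    pfexec /usr/sbin/zpool import vdipool
    pfexec /usr/sbin/zpool upgrade -v    # lists the versions this release supports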

Re: [zfs-discuss] zpool scrub on b123

2011-04-15 Thread Karl Rossing
Would moving the pool to a Solaris 10U9 server fix the random RAIDZ errors?

On 04/15/2011 02:23 PM, Cindy Swearingen wrote:
> D'oh. One more thing. We had a problem in b120-123 that caused random checksum errors on RAIDZ configs. This info is still in the ZFS troubleshooting guide. See if a zp

Re: [zfs-discuss] zpool scrub on b123

2011-04-15 Thread Karl Rossing
I'm going to wait until the scrub is complete before diving in some more.

I'm wondering if replacing the LSI SAS 3801E with an LSI SAS 9200-8e might help too.

Karl

On 04/15/2011 02:23 PM, Cindy Swearingen wrote:
> D'oh. One more thing. We had a problem in b120-123 that caused random checksum

Re: [zfs-discuss] zpool scrub on b123

2011-04-15 Thread Cindy Swearingen
D'oh. One more thing. We had a problem in b120-123 that caused random checksum errors on RAIDZ configs. This info is still in the ZFS troubleshooting guide. See if a zpool clear resolves these errors. If that works, then I would upgrade to a more recent build and see if the problem is resolved
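
The clear-and-verify step Cindy suggests, as a sketch (pool name taken from the thread):

    # clear the logged checksum errors
    pfexec /usr/sbin/zpool clear vdipool

    # scrub again and see whether new CKSUM errors accumulate
    pfexec /usr/sbin/zpool scrub vdipool
    pfexec /usr/sbin/zpool status -v vdipool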

Re: [zfs-discuss] zpool scrub on b123

2011-04-15 Thread Cindy Swearingen
Hi Karl... I just saw this same condition on another list. I think the poster resolved it by replacing the HBA. Drives go bad but they generally don't all go bad at once, so I would suspect some common denominator like the HBA/controller, cables, and so on. See what FMA thinks by running fmdump
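
A sketch of the FMA checks Cindy is pointing at; the diagnosed faults and the raw error telemetry are what help separate a single bad disk from a shared HBA or cabling problem:

    # faults FMA has actually diagnosed
    pfexec fmadm faulty

    # fault event summary, then the underlying error reports
    pfexec fmdump
    pfexec fmdump -eV | less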

[zfs-discuss] zpool scrub on b123

2011-04-15 Thread Karl Rossing
Hi,

One of our zfs volumes seems to be having some errors. So I ran zpool scrub and it's currently showing the following.

-bash-3.2$ pfexec /usr/sbin/zpool status -x
  pool: vdipool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An attempt was made
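
For completeness, the scrub itself and a more detailed status view look like this (a sketch using the pool name shown above):

    pfexec /usr/sbin/zpool scrub vdipool

    # -v also lists any files affected by unrecoverable errors
    pfexec /usr/sbin/zpool status -v vdipool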