Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-21 Thread Jim Klimov
On 2013-01-21 07:06, Stephan Budach wrote: Are there switch stats on whether it has seen media errors? Has anybody gotten QLogic's SanSurfer to work with anything newer than Java 1.4.2? ;) I checked the logs on my switches and they don't seem to indicate such issues, but I am lacking the real-ti

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-20 Thread Stephan Budach
On 21.01.13 00:21, Jim Klimov wrote: Did you try replacing the patch-cables and/or SFPs on the path between servers and disks, or at least cleaning them? A speck of dust (or, God forbid, a pixel of body fat from a fingerprint) caught between the two optic cable cutoffs might cause any kind of s

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-20 Thread Jim Klimov
Did you try replacing the patch-cables and/or SFPs on the path between servers and disks, or at least cleaning them? A speck of dust (or, God forbid, a pixel of body fat from a fingerprint) caught between the two optic cable cutoffs might cause any kind of signal weirdness from time to time... and

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-20 Thread Jim Klimov
On 2013-01-20 16:56, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov And regarding the "considerable activity" - AFAIK there is little way for ZFS to reliably read and

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-20 Thread Stephan Budach
On 20.01.13 16:51, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Stephan Budach I am always experiencing chksum errors while scrubbing my zpool(s), but I never experienced chk

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-20 Thread Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov
>
> And regarding the "considerable activity" - AFAIK there is little way
> for ZFS to reliably read and test "TXGs newer than X"

My understanding is like this: When you make a sn

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-20 Thread Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Stephan Budach
>
> I am always experiencing chksum errors while scrubbing my zpool(s), but
> I never experienced chksum errors while resilvering. Does anybody know
> why that would be?

When y

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Jim Klimov
On 2013-01-19 20:23, Jim Klimov wrote: On 2013-01-19 20:08, Bob Friesenhahn wrote: On Sat, 19 Jan 2013, Jim Klimov wrote: On 2013-01-19 18:17, Bob Friesenhahn wrote: Resilver may in fact be just verifying that the pool disks are coherent via metadata. This might happen if the fiber channel i

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Stephan Budach
On 19.01.13 20:18, Bob Friesenhahn wrote: On Sat, 19 Jan 2013, Stephan Budach wrote: Just ignore the timestamp, as it seems that the time is not set correctly, but the dates match my two issues from today and Thursday, which accounts for three days. I didn't catch that before, but it seems

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Jim Klimov
On 2013-01-19 20:08, Bob Friesenhahn wrote: On Sat, 19 Jan 2013, Jim Klimov wrote: On 2013-01-19 18:17, Bob Friesenhahn wrote: Resilver may in fact be just verifying that the pool disks are coherent via metadata. This might happen if the fiber channel is flapping. Correction: that (verifica

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Bob Friesenhahn
On Sat, 19 Jan 2013, Stephan Budach wrote: Just ignore the timestamp, as it seems that the time is not set correctly, but the dates match my two issues from today and Thursday, which accounts for three days. I didn't catch that before, but it seems to clearly indicate a problem with the FC co

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Bob Friesenhahn
On Sat, 19 Jan 2013, Jim Klimov wrote: On 2013-01-19 18:17, Bob Friesenhahn wrote: Resilver may in fact be just verifying that the pool disks are coherent via metadata. This might happen if the fiber channel is flapping. Correction: that (verification) would be scrubbing ;) I don't think t

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Stephan Budach
On 19.01.13 18:17, Bob Friesenhahn wrote: On Sat, 19 Jan 2013, Stephan Budach wrote: Now, this zpool is made of 3-way mirrors and currently 13 out of 15 vdevs are resilvering (which they had gone through yesterday as well) and I never got any error while resilvering. I have been all over th

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Jim Klimov
On 2013-01-19 18:17, Bob Friesenhahn wrote: Resilver may in fact be just verifying that the pool disks are coherent via metadata. This might happen if the fiber channel is flapping. Correction: that (verification) would be scrubbing ;) The way I get it, resilvering is related to scrubbing but

Re: [zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Bob Friesenhahn
On Sat, 19 Jan 2013, Stephan Budach wrote: Now, this zpool is made of 3-way mirrors and currently 13 out of 15 vdevs are resilvering (which they had gone through yesterday as well) and I never got any error while resilvering. I have been all over the setup to find any glitch or bad part, but

[zfs-discuss] Resilver w/o errors vs. scrub with errors

2013-01-19 Thread Stephan Budach
Hi, I am always experiencing chksum errors while scrubbing my zpool(s), but I never experienced chksum errors while resilvering. Does anybody know why that would be? This happens on all of my servers, Sun Fire 4170M2, Dell PE 650 and on any FC storage that I have. Currently I had a major iss