On Fri, 08 Jan 2010 13:55:13 +0100, Darren J Moffat <darr...@opensolaris.org> wrote:
> Frank Batschulat (Home) wrote: >> This just can't be an accident, there must be some coincidence and thus >> there's a good chance >> that these CHKSUM errors must have a common source, either in ZFS or in NFS ? > > What are you using for on the wire protection with NFS ? Is it shared > using krb5i or do you have IPsec configured ? If not I'd recommend > trying one of those and see if your symptoms change. Hey Darren, doing krb5i is certainly a good idea for additional protection in general, however I have some doubts that NFS OTW corruption will produce the exact same wrong checksum inside 2 totally different setups and networks, as comparing Mike and my results showed [see 1]. cheers frankB [1] osoldev.batschul./export/home/batschul.=> fmdump -eV | grep cksum_actual | sort | uniq -c | sort -n | tail 2 cksum_actual = 0x4bea1a77300 0xf6decb1097980 0x217874c80a8d9100 0x7cd81ca72df5ccc0 2 cksum_actual = 0x5c1c805253 0x26fa7270d8d2 0xda52e2079fd74 0x3d2827dd7ee4f21 6 cksum_actual = 0x28e08467900 0x479d57f76fc80 0x53bca4db5209300 0x983ddbb8c4590e40 *A 6 cksum_actual = 0x348e6117700 0x765aa1a547b80 0xb1d6d98e59c3d00 0x89715e34fbf9cdc0 *B 7 cksum_actual = 0x0 0x0 0x0 0x0 *C 11 cksum_actual = 0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40 *D 14 cksum_actual = 0x175bb95fc00 0x1767673c6fe00 0xfa9df17c835400 0x7e0aef335f0c7f00 *E 17 cksum_actual = 0x2eb772bf800 0x5d8641385fc00 0x7cf15b214fea800 0xd4f1025a8e66fe00 *F 20 cksum_actual = 0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80 *G 25 cksum_actual = 0x5d6ee57f00 0x178a70d27f80 0x3fc19c3a19500 0x82804bc6ebcfc0 ========================================================================== now compare this with Mike's error output as posted here: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg33041.html # fmdump -eV | grep cksum_actual | sort | uniq -c | sort -n | tail 2 cksum_actual = 0x14c538b06b6 0x2bb571a06ddb0 0x3e05a7c4ac90c62 0x290cbce13fc59dce *D 3 cksum_actual = 0x175bb95fc00 0x1767673c6fe00 0xfa9df17c835400 0x7e0aef335f0c7f00 *E 3 cksum_actual = 0x2eb772bf800 0x5d8641385fc00 0x7cf15b214fea800 0xd4f1025a8e66fe00 *B 4 cksum_actual = 0x0 0x0 0x0 0x0 4 cksum_actual = 0x1d32a7b7b00 0x248deaf977d80 0x1e8ea26c8a2e900 0x330107da7c4bcec0 5 cksum_actual = 0x14b8f7afe6 0x915db8d7f87 0x205dc7979ad73 0x4e0b3a8747b8a8 *C 6 cksum_actual = 0x1184cb07d00 0xd2c5aab5fe80 0x69ef5922233f00 0x280934efa6d20f40 *A 6 cksum_actual = 0x348e6117700 0x765aa1a547b80 0xb1d6d98e59c3d00 0x89715e34fbf9cdc0 *F 16 cksum_actual = 0xbaddcafe00 0x5dcc54647f00 0x1f82a459c2aa00 0x7f84b11b3fc7f80 *G 48 cksum_actual = 0x5d6ee57f00 0x178a70d27f80 0x3fc19c3a19500 0x82804bc6ebcfc0 and observe that the values in 'chksum_actual' causing our CHKSUM pool errors eventually because of missmatching with what had been expected are the SAME ! for 2 totally different client systems and 2 different NFS servers (mine vrs. Mike's), see the entries marked with *A to *G. _______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss