Hi Carsten,
This was supposed to be fixed in build 164 of Nevada (6742788). If
you are still seeing this
issue in S11, I think you should raise a bug with relevant details.
As Paul has suggested,
this could also be due to an incomplete snapshot.
I have seen interrupted zfs recv operations cause weird bugs.
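If you want to rule that out, comparing the snapshot lists on the primary and
the mirror is usually enough; the pool and host names below are only placeholders:

zfs list -t snapshot -o name,creation -r tank | tail
ssh primary zfs list -t snapshot -o name,creation -r tank | tail

An interrupted receive typically shows up as the newest snapshot missing on
some of the filesystems on the receiving side.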
Thanks,
Deepak.
On 03/27/12 12:44 PM, Carsten John wrote:
Hello everybody,
I have a Solaris 11 box here (Sun X4270) that crashes with a kernel panic
during the import of a zpool (some 30 TB) containing ~500 ZFS filesystems after
reboot. This caused a reboot loop until I booted single user and removed
/etc/zfs/zpool.cache.
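For anyone hitting the same loop: recovery amounts to booting single user (on
this x86 box, adding -s to the kernel line in GRUB) and then removing or
renaming the cache file so the pool is not imported automatically at boot,
along these lines:

mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bad
reboot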
From /var/adm/messages:
savecore: [ID 570001 auth.error] reboot after panic: BAD TRAP: type=e (#pf Page fault)
rp=ffffff002f9cec50 addr=20 occurred in module "zfs" due to a NULL pointer
dereference
savecore: [ID 882351 auth.error] Saving compressed system crash dump in
/var/crash/vmdump.2
This is what mdb tells me:
mdb unix.2 vmcore.2
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc pcplusmp
scsi_vhci zfs mpt sd ip hook neti arp usba uhci sockfs qlc fctl s1394 kssl lofs
random fcp idm sata fcip cpc crypto ufs logindmux ptm sppp ]
$c
zap_leaf_lookup_closest+0x45(ffffff0700ca2a98, 0, 0, ffffff002f9cedb0)
fzap_cursor_retrieve+0xcd(ffffff0700ca2a98, ffffff002f9ceed0, ffffff002f9cef10)
zap_cursor_retrieve+0x195(ffffff002f9ceed0, ffffff002f9cef10)
zfs_purgedir+0x4d(ffffff0721d32c20)
zfs_rmnode+0x57(ffffff0721d32c20)
zfs_zinactive+0xb4(ffffff0721d32c20)
zfs_inactive+0x1a3(ffffff0721d3a700, ffffff07149dc1a0, 0)
fop_inactive+0xb1(ffffff0721d3a700, ffffff07149dc1a0, 0)
vn_rele+0x58(ffffff0721d3a700)
zfs_unlinked_drain+0xa7(ffffff07022dab40)
zfsvfs_setup+0xf1(ffffff07022dab40, 1)
zfs_domount+0x152(ffffff07223e3c70, ffffff0717830080)
zfs_mount+0x4e3(ffffff07223e3c70, ffffff07223e5900, ffffff002f9cfe20,
ffffff07149dc1a0)
fsop_mount+0x22(ffffff07223e3c70, ffffff07223e5900, ffffff002f9cfe20,
ffffff07149dc1a0)
domount+0xd2f(0, ffffff002f9cfe20, ffffff07223e5900, ffffff07149dc1a0,
ffffff002f9cfe18)
mount+0xc0(ffffff0713612c78, ffffff002f9cfe98)
syscall_ap+0x92()
_sys_sysenter_post_swapgs+0x149()
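In case more detail from the dump is useful, the usual dcmds work in the same
mdb session (generic dump triage, nothing ZFS-specific):

::status
::msgbuf
::panicinfo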
I can import the pool readonly.
The server is a mirror for our primary file server and is synced via zfs
send/receive.
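For completeness, the read-only import and the sync itself are nothing exotic;
roughly the following, with pool and host names changed and the send/receive
run from the primary side:

zpool import -o readonly=on tank
zfs send -R -i tank@prev tank@now | ssh mirrorhost zfs receive -dF tank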
I saw a similar effect some time ago on an OpenSolaris box (build 111b). That
time my final solution was to copy the read-only mounted data over to a newly
created pool. As this is the second time this failure has occurred (on different
machines), I'm really concerned about overall reliability...
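That migration was essentially a full replication from the read-only pool into
a fresh one, assuming a recent recursive snapshot already existed on it (pool
and device names here are made up):

zpool create newtank mirror c0t2d0 c0t3d0
zfs send -R tank@last | zfs receive -dF newtank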
Any suggestions?
thx
Carsten
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss