Hi,

System: Netra 1405, 4x 450 MHz CPUs, 4 GB RAM, 2x 146 GB (root pool) and 2x 146 GB ("space" pool). snv_98.

After a panic, the system hangs on boot, and a manual attempt to mount (at least) one dataset in single-user mode hangs as well.

The panic:

Dec 27 04:42:11 base ^Mpanic[cpu0]/thread=300021c1a20:
Dec 27 04:42:11 base unix: [ID 521688 kern.notice] [AFT1] errID 0x00167f73.1c737868 UE Error(s)
Dec 27 04:42:11 base     See previous message(s) for details
Dec 27 04:42:11 base unix: [ID 100000 kern.notice]
Dec 27 04:42:11 base genunix: [ID 723222 kern.notice] 000002a10433efc0 SUNW,UltraSPARC-II:cpu_aflt_log+5b4 (3, 2a10433f208, 2a10433f2e0, 10, 2a10433f207, 2a10433f208)
Dec 27 04:42:11 base genunix: [ID 179002 kern.notice]   %l0-3: 000002a10433f0cb 00000000000f0000 00000000012ccc00 00000000012cd000
Dec 27 04:42:11 base   %l4-7: 000002a10433f208 0000000000000170 00000000012ccc00 0000000000000001
Dec 27 04:42:11 base genunix: [ID 723222 kern.notice] 000002a10433f210 SUNW,UltraSPARC-II:cpu_async_error+cdc (7fe00000, 0, 180200000, 40, 0, a0b7ff60)
Dec 27 04:42:11 base genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000000 000000000180c000 0000000000000000 000002a10433f3d4
Dec 27 04:42:11 base   %l4-7: 00000000012cc400 000000007e600000 00000000012cc400 0000000000000001
Dec 27 04:42:11 base genunix: [ID 723222 kern.notice] 000002a10433f410 unix:ktl0+48 (2a10433fec0, 2a10433ff80, 180e580, 6, 180c000, 1800000)
Dec 27 04:42:11 base genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000002 0000000000001400 0000000080001601 00000000012c1578
Dec 27 04:42:11 base   %l4-7: 0000000ae394c629 0000060017a32260 000000000000000b 000002a10433f4c0
Dec 27 04:42:12 base genunix: [ID 723222 kern.notice] 000002a10433f560 unix:resume+240 (300021c1a20, 180c000, 1835c40, 6001c1f20c8, 16, 30001e4cc40)
Dec 27 04:42:12 base genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000000 0000000000000000 00000180048279c0 000002a1035dbca0
Dec 27 04:42:12 base   %l4-7: 0000000000000001 0000000001867800 0000000025be86dc 00000000018bbc00
Dec 27 04:42:12 base genunix: [ID 723222 kern.notice] 000002a10433f610 genunix:cv_wait+3c (3001365ba10, 3001365ba10, 1, 18d0c00, c44000, 0)
Dec 27 04:42:12 base genunix: [ID 179002 kern.notice]   %l0-3: 0000000000c44002 00000000018d0e58 0000000000000001 0000000000c44002
Dec 27 04:42:12 base   %l4-7: 0000000000000000 0000000000000001 0000000000000002 0000000001326e5c
Dec 27 04:42:12 base genunix: [ID 723222 kern.notice] 000002a10433f6c0 zfs:zio_wait+30 (3001365b778, 6001cdcf7e8, 3001365ba18, 3001365ba10, 30034dc1f48, 1)
Dec 27 04:42:12 base genunix: [ID 179002 kern.notice]   %l0-3: 000006001cdcf7f0 000000000000ffff 0000000000000100 000000000000fc00
Dec 27 04:42:12 base   %l4-7: 00000000018d7000 000000000c6eefd9 000000000c6eefd8 000000000c6eefd8
Dec 27 04:42:12 base genunix: [ID 723222 kern.notice] 000002a10433f770 zfs:zil_commit_writer+2d0 (6001583be00, 4b0, 1b1a4d54, 42a03, cfc67, 0)
Dec 27 04:42:12 base genunix: [ID 179002 kern.notice]   %l0-3: 0000060018b5d068 ffffffffffffffff 0000060010ce1040 000006001583be88
Dec 27 04:42:12 base   %l4-7: 0000060013760380 00000000000000c0 000003002bf81138 000003001365b778
Dec 27 04:42:13 base genunix: [ID 723222 kern.notice] 000002a10433f820 zfs:zil_commit+68 (6001583be00, 1b1a5ae5, 38bc5, 6001583be7c, 1b1a5ae5, 0)
Dec 27 04:42:13 base genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000001 0000000000000001 00000600177fe080 000006001c1f2ad8
Dec 27 04:42:13 base   %l4-7: 00000000000001c0 0000000000000001 0000060010c78000 0000000000000000
Dec 27 04:42:13 base genunix: [ID 723222 kern.notice] 000002a10433f8d0 zfs:zfs_fsync+f8 (18e5800, 0, 134fc00, 3001c2c4860, 134fc00, 134fc00)
Dec 27 04:42:13 base genunix: [ID 179002 kern.notice]   %l0-3: 000003001e94d948 0000000000010000 000000000180c008 0000000000000008
Dec 27 04:42:13 base   %l4-7: 0000060013760458 0000000000000000 000000000134fc00 00000000018d2000
Dec 27 04:42:13 base genunix: [ID 723222 kern.notice] 000002a10433f980 genunix:fop_fsync+40 (300131ed600, 10, 60011c08b68, 0, 60010c77200, 30028320b40)
Dec 27 04:42:13 base genunix: [ID 179002 kern.notice]   %l0-3: 0000060011f6c828 0000000000000007 000006001c1f20c8 00000000013409d8
Dec 27 04:42:13 base   %l4-7: 0000000000000000 0000000000000001 0000000000000000 00000000018bcc00
Dec 27 04:42:13 base genunix: [ID 723222 kern.notice] 000002a10433fa30 genunix:fdsync+40 (7, 10, 0, 184, 10, 30007adda40)
Dec 27 04:42:13 base genunix: [ID 179002 kern.notice]   %l0-3: 0000000000000000 000000000000f071 00000000f0710000 000000000000f071
Dec 27 04:42:13 base   %l4-7: 0000000000000001 000000000180c000 0000000000000000 0000000000000000
Dec 27 04:42:14 base unix: [ID 100000 kern.notice]
Dec 27 04:42:14 base genunix: [ID 672855 kern.notice] syncing file systems...
Dec 27 04:42:14 base genunix: [ID 904073 kern.notice]  done
Dec 27 04:42:15 base SUNW,UltraSPARC-II: [ID 201454 kern.warning] WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data access at TL=0, errID 0x00167f73.1c737868
Dec 27 04:42:15 base     AFSR 0x00000001<ME>.80200000<PRIV,UE> AFAR 0x00000000.a0b7ff60
Dec 27 04:42:15 base     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x101b708
Dec 27 04:42:15 base     UDBH 0x0203<UE> UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND 0x00
Dec 27 04:42:15 base     UDBH Syndrome 0x3 Memory Module U1402 U0402 U1401 U0401
Dec 27 04:42:15 base SUNW,UltraSPARC-II: [ID 325743 kern.warning] WARNING: [AFT1] errID 0x00167f73.1c737868 Syndrome 0x3 indicates that this may not be a memory module problem
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 151010 kern.info] [AFT2] errID 0x00167f73.1c737868 PA=0x00000000.a0b7ff60
Dec 27 04:42:16 base     E$tag 0x00000000.1cc01416 E$State: Exclusive E$parity 0x0e
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x00): 0x0070ba48.00000000
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x08): 0x00000000.00000000
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x10): 0x00ec48b9.495349e1
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x18): 0x4955a237.00000000
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data (0x20): 0x00000800.00000000 *Bad* PSYND=0xff00
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x28): 0x0070ba28.00000000
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x30): 0x00000000.00000000
Dec 27 04:42:16 base SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data (0x38): 0x027a7ea6.494f4aeb
Dec 27 04:47:56 base genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version snv_98 64-bit
Dec 27 04:47:56 base genunix: [ID 172908 kern.notice] Copyright 1983-2008 Sun Microsystems, Inc.  All rights reserved.
Dec 27 04:47:56 base Use is subject to license terms.

My guess would be a broken CPU, maybe the old E-cache problem...

Anyway, "zfs mount space" works fine, but "zfs mount space/postfix" hangs. A look at the hung zfs process (PID 236) with mdb shows:

# echo "0t236::pid2proc|::walk thread|::findstack -v" | mdb -k
stack pointer for thread 30001cecc00: 2a100fa2181
[ 000002a100fa2181 cv_wait+0x3c() ]
  000002a100fa2231 txg_wait_open+0x58(60014aa1158, d000b, 0, 60014aa119c, 60014aa119e, 60014aa1150)
  000002a100fa22e1 dmu_tx_assign+0x3c(60022dd3780, 1, 7, 60013cd5918, 5b, 1)
  000002a100fa2391 dmu_free_long_range_impl+0xc4(600245fbdb0, 60025f69750, 0, 400, 0, 1)
  000002a100fa2451 dmu_free_long_range+0x44(600245fbdb0, 43b12, 0, ffffffffffffffff, 1348800, 0)
  000002a100fa2511 zfs_rmnode+0x68(60025bb6f20, 12, 600243af9e0, 1, 600243af880, 600245fbdb0)
  000002a100fa25d1 zfs_inactive+0x134(600243af988, 0, 60025f6fef8, 4000, 420, 60025bb6f20)
  000002a100fa2681 zfs_rename+0x73c(6002401e400, 40800000004, 6002401e400, 60021860041, 60022dd3780, 60025bb6fe8)
  000002a100fa27c1 fop_rename+0xac(6002401e400, 60021860030, 6002401e400, 60021860041, 60010c03e08, 0)
  000002a100fa2881 zfs_replay_rename+0xb4(18bbc00, 6002400e8b0, 0, 60014a94000, 0, 0)
  000002a100fa2951 zil_replay_log_record+0x244(18d1ed0, 60017108000, 2a100fa3450, 0, 6002347fc80, 60014a94000)
  000002a100fa2a41 zil_parse+0x160(58, 132573c, 13253a4, 2a100fa3450, cff2c, 1978d7)
  000002a100fa2ba1 zil_replay+0xa4(9050200ff00ff, 600243af880, 600243af8b0, 40000, 60022ad91d8, 6002347fc80)
  000002a100fa2c81 zfsvfs_setup+0x94(600243af880, 1, 18d1c00, 600151e8400, 18d0c00, 0)
  000002a100fa2d31 zfs_domount+0x2dc(60011f08d08, 60022afe480, 60011f08d08, 600243af890, 0, 400)
  000002a100fa2e11 zfs_mount+0x1ec(60011f08d08, 6002401e200, 2a100fa39d8, 100, 0, 2)
  000002a100fa2f71 domount+0xaf0(100, 1, 6002401e200, 8077, 60011f08d08, 0)
  000002a100fa3121 mount+0xec(60023dd7388, 2a100fa3ad8, 0, ff104ed8, 100, 45bd0)
  000002a100fa3221 syscall_ap+0x44(2a0, ffbfe8a8, 115b9e8, 60023dd72d0, 15, 0)
  000002a100fa32e1 syscall_trap32+0xcc(45bd0, ffbfe8a8, 100, ff104ed8, 0, 0)


"zpool status" and fmdump don't indicate any problems.

Is there any possibility of recovering the dataset? I do have backups of all the data, but I would really like to recover it in place to save some time.

Is there anything special to look for in the zdb output? Are there any other diagnostics that would be useful?
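Since the mount thread is stuck in ZIL replay (zil_replay in the stack above), my plan so far is roughly the following. This is just a sketch; the exact zdb flags and output format vary between builds, so treat it as a starting point rather than a recipe:

```shell
# Raw FMA error reports (fmdump without -e only shows diagnosed faults)
fmdump -eV | less

# Per-device error counters and any recorded data errors in the pool
zpool status -v space

# Walk the dataset's object metadata and see whether zdb gets through it
zdb -dddd space/postfix

# Dump the intent log records that a mount would try to replay
zdb -iv space/postfix
```

If zdb can walk the dataset and its intent log cleanly, I'd take that as a hint that the on-disk state is intact and the hang is in the replay path itself, but I may be reading too much into it.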

Thanks in advance!

Best Regards //Magnus


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
