On 27-Dec-06, at 9:45 PM, George Wilson wrote:
Siegfried,
Can you provide the panic string that you are seeing? We should be
able to pull out the persistent error log information from the
corefile. You can take a look at the spa_get_errlog() function as a
starting point.
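For example, against the saved crash dump you could start with
something along these lines, run from the directory where savecore
put the dump. I'm going from the current OpenSolaris sources for the
spa_t member names (spa_errlog_last, spa_errlog_scrub,
spa_errlist_last, spa_errlist_scrub), so treat this as a rough sketch
rather than exact syntax, and substitute the spa_t address that
"::spa -v" prints for your pool where I've written <spa-addr>:

# mdb unix.0 vmcore.0
> ::spa -v
> <spa-addr>::print spa_t spa_errlog_last spa_errlog_scrub
> <spa-addr>::print spa_t spa_errlist_last spa_errlist_scrub

The first ::print shows the object numbers of the persistent error
logs on disk, and the second shows the in-core error lists; those are
the structures spa_get_errlog() pulls its information from.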
This is the panic string that I am seeing:
Dec 26 18:55:51 FServe unix: [ID 836849 kern.notice]
Dec 26 18:55:51 FServe ^Mpanic[cpu1]/thread=fffffe8000929c80:
Dec 26 18:55:51 FServe genunix: [ID 683410 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=fffffe8000929980 addr=ffffff00b3e621f0
Dec 26 18:55:51 FServe unix: [ID 100000 kern.notice]
Dec 26 18:55:51 FServe unix: [ID 839527 kern.notice] sched:
Dec 26 18:55:51 FServe unix: [ID 753105 kern.notice] #pf Page fault
Dec 26 18:55:51 FServe unix: [ID 532287 kern.notice] Bad kernel fault at addr=0xffffff00b3e621f0
Dec 26 18:55:51 FServe unix: [ID 243837 kern.notice] pid=0, pc=0xfffffffff3eaa2b0, sp=0xfffffe8000929a78, eflags=0x10282
Dec 26 18:55:51 FServe unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
Dec 26 18:55:51 FServe unix: [ID 354241 kern.notice] cr2: ffffff00b3e621f0 cr3: a3ec000 cr8: c
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] rdi: fffffe80dd69ad40 rsi: ffffff00b3e62040 rdx: 0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] rcx: ffffffff9c6bd6ce r8: 1 r9: ffffffff
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] rax: ffffff00b3e62208 rbx: ffffff00b3e62040 rbp: fffffe8000929ab0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] r10: ffffffff982421c8 r11: 1 r12: ffffff00b3e62208
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] r13: ffffffff81204468 r14: 1c8 r15: fffffe80dd69ad40
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] fsb: ffffffff80000000 gsb: ffffffff80f1d000 ds: 43
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] es: 43 fs: 0 gs: 1c3
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] trp: e err: 0 rip: fffffffff3eaa2b0
Dec 26 18:55:51 FServe unix: [ID 592667 kern.notice] cs: 28 rfl: 10282 rsp: fffffe8000929a78
Dec 26 18:55:51 FServe unix: [ID 266532 kern.notice] ss: 30
Dec 26 18:55:51 FServe unix: [ID 100000 kern.notice]
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929890 unix:real_mode_end+6ad1 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929970 unix:trap+d77 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929980 unix:cmntrap+13f ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929ab0 zfs:vdev_queue_offset_compare+0 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929ae0 genunix:avl_add+1f ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929b60 zfs:vdev_queue_io_to_issue+1ec ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929ba0 zfs:zfsctl_ops_root+33bc48b1 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929bc0 zfs:vdev_disk_io_done+11 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929bd0 zfs:vdev_io_done+12 ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929be0 zfs:zio_vdev_io_done+1b ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929c60 genunix:taskq_thread+bc ()
Dec 26 18:55:51 FServe genunix: [ID 655072 kern.notice] fffffe8000929c70 unix:thread_start+8 ()
Dec 26 18:55:51 FServe unix: [ID 100000 kern.notice]
Dec 26 18:55:51 FServe genunix: [ID 672855 kern.notice] syncing file systems...
Dec 26 18:55:51 FServe genunix: [ID 733762 kern.notice] 3
Dec 26 18:55:52 FServe genunix: [ID 904073 kern.notice] done
Dec 26 18:55:53 FServe genunix: [ID 111219 kern.notice] dumping to /dev/dsk/c1d0s1, offset 1719074816, content: kernel
Additionally, though perhaps unrelated, I came across this while
looking at the logs:
Dec 26 17:53:00 FServe marvell88sx: [ID 812950 kern.warning] WARNING: marvell88sx0: error on port 1:
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]     SError interrupt
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]     EDMA self disabled
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]     command request queue parity error
Dec 26 17:53:00 FServe marvell88sx: [ID 131198 kern.info] SErrors:
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]         Recovered communication error
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]         PHY ready change
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]         10-bit to 8-bit decode error
Dec 26 17:53:00 FServe marvell88sx: [ID 517869 kern.info]         Disparity error
This happened right before a system hang. I have this other strange
problem where if I send certain files over the network (CIFS or NFS),
the machine slows to a crawl until it is "hung". This is
reproducible every time with the same "special" files, but it does
not happen locally, only over the network. I already posted about
this in network-discuss and am currently investigating the issue.
Additionally, you can open the corefile with mdb and take a look at
the vdev error stats. Here's an example (hopefully the formatting
doesn't get messed up):
Excellent information, thanks! It looks like there are no
read/write/chksum errors.
I now at least have a way of checking the scrub results until the
panic is fixed (hopefully someday).
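In the meantime I'll probably also just capture "zpool status -v" to
a file every minute or so while the scrub runs, so the last results
survive the panic and reboot. Something simple like this (the file
name and interval are arbitrary):

while true; do
        date >> /var/tmp/scrub-status.out
        zpool status -v >> /var/tmp/scrub-status.out
        sleep 60
done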
Siegfried
> ::spa -v
ADDR                 STATE    NAME
0000060004473680     ACTIVE   test

    ADDR             STATE    AUX   DESCRIPTION
    0000060004bcb500 HEALTHY  -     root
    0000060004bcafc0 HEALTHY  -     /dev/dsk/c0t2d0s0

> 0000060004bcb500::vdev -re
ADDR             STATE    AUX   DESCRIPTION
0000060004bcb500 HEALTHY  -     root

                     READ      WRITE      FREE     CLAIM     IOCTL
        OPS             0          0         0         0         0
        BYTES           0          0         0         0         0
        EREAD           0
        EWRITE          0
        ECKSUM          0

0000060004bcafc0 HEALTHY  -     /dev/dsk/c0t2d0s0

                     READ      WRITE      FREE     CLAIM     IOCTL
        OPS          0x17      0x1d2         0         0         0
        BYTES    0x19c000   0x11da00         0         0         0
        EREAD           0
        EWRITE          0
        ECKSUM          0
This will show you any read/write/cksum errors.
Thanks,
George
Siegfried Nikolaivich wrote:
Hello All,
I am wondering if there is a way to save the scrub results right
before the scrub is complete.
After upgrading to Solaris 10U3 I still have ZFS panicking right as
the scrub completes. The scrub results seem to be "cleared" when the
system boots back up, so I never get a chance to see them.
Does anyone know of a simple way?
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss