I have a Sun Fire T1000 (sparc64) which, a while ago, was occasionally panicking, so I submitted a bug. kettenis@ committed a fix, and it stopped panicking. All good.
Now I have a different problem. Every now and then, it just hangs. As far as I can tell, it's a complete hard lock: I can't get it to drop into ddb, I can't ping the box, it just sits there.

For the past few days I've had screen running on a machine connected to the serial console, with the T1000 running while true; do uptime; sleep 5; done. It seems that when it hangs, it does so whilst updating the various mirrors the box hosts, at around 3am. I suspect this has more to do with load than with the specific task.

My question is this: is there anything I can do to gather useful information? At the moment all I have is a dmesg and "It hangs sometimes under load.", which is bugger all use to anyone.

Is there a magic key sequence I can send to the serial console? Is there a sysctl I can turn on? Is there a ukc or config knob I can poke? I don't mind running with crazy debug symbols or silly cronjobs that chuck debug data to disk once a minute; I just want to be able to put something helpful in sendbug(1).

Just in case, dmesg below.

--
SD

slash:~# dmesg
console is /virtual-devi...@100/cons...@1
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2009 OpenBSD. All rights reserved.
http://www.OpenBSD.org
OpenBSD 4.4-current (GENERIC.MP) #584: Thu Jan 8 18:39:22 MST 2009
    t...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP
real mem = 17171480576 (16376MB)
avail mem = 16805117952 (16026MB)
mainbus0 at root: Sun Fire(TM) T1000
cpu0 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu1 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu2 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu3 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu4 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu5 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu6 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu7 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu8 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu9 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu10 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu11 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu12 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu13 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu14 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu15 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu16 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu17 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu18 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu19 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu20 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu21 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu22 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu23 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu24 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu25 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu26 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu27 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu28 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu29 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu30 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu31 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
vbus0 at mainbus0
"nvram" at vbus0 not configured
"flashprom" at vbus0 not configured
vcons0 at vbus0: ivec 0x111
"ncp" at vbus0 not configured
vrtc0 at vbus0
"loop" at vbus0 not configured
"loop" at vbus0 not configured
"echo" at vbus0 not configured
"fma" at vbus0 not configured
"sunvts" at vbus0 not configured
"sunmc" at vbus0 not configured
"explorer" at vbus0 not configured
"led" at vbus0 not configured
vpci0 at mainbus0: bus 2 to 2, dvma map 80000000-ffffffff
pci0 at vpci0
vpci1 at mainbus0: bus 2 to 4, dvma map 80000000-ffffffff
pci1 at vpci1
ppb0 at pci1 dev 0 function 0 "ServerWorks PCIE-PCIX" rev 0xb3
pci2 at ppb0 bus 3
bge0 at pci2 dev 4 function 0 "Broadcom BCM5714" rev 0xa2, BCM5715 A1 (0x9001): ivec 0x7d4, address 00:14:4f:2c:f7:e2
brgphy0 at bge0 phy 1: BCM5714 10/100/1000baseT/SX PHY, rev. 0
bge1 at pci2 dev 4 function 1 "Broadcom BCM5714" rev 0xa2, BCM5715 A1 (0x9001): ivec 0x7d5, address 00:14:4f:2c:f7:e3
brgphy1 at bge1 phy 1: BCM5714 10/100/1000baseT/SX PHY, rev. 0
ppb1 at pci2 dev 8 function 0 "ServerWorks HT-1000 PCIX" rev 0xb3
pci3 at ppb1 bus 4
bge2 at pci3 dev 1 function 0 "Broadcom BCM5704C" rev 0x10, BCM5704 B0 (0x2100): ivec 0x7c2, address 00:14:4f:2c:f7:e4
brgphy2 at bge2 phy 1: BCM5704 10/100/1000baseT PHY, rev. 0
bge3 at pci3 dev 1 function 1 "Broadcom BCM5704C" rev 0x10, BCM5704 B0 (0x2100): ivec 0x7c1, address 00:14:4f:2c:f7:e5
brgphy3 at bge3 phy 1: BCM5704 10/100/1000baseT PHY, rev. 0
mpi0 at pci3 dev 2 function 0 "Symbios Logic SAS1064" rev 0x02: ivec 0x7c0
scsibus0 at mpi0: 63 targets, initiator 63
sd0 at scsibus0 targ 0 lun 0: <ATA, ST31000340AS, SD15> SCSI3 0/direct fixed
sd0: 953869MB, 512 bytes/sec, 1953525168 sec total
ebus0 at mainbus0
com0 at ebus0 addr c2c000-c2c007 ivec 0xa: st16650, 32 byte fifo
softraid0 at root
bootpath: /p...@7c0,0/p...@0,0/p...@8,0/s...@2,0/d...@0,0
root on sd0a swap on sd0b dump on sd0b
WARNING: / was not properly unmounted

slash:~# sysctl hw
hw.machine=sparc64
hw.model=SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
hw.ncpu=32
hw.byteorder=4321
hw.pagesize=8192
hw.disknames=sd0
hw.diskcount=1
hw.cpuspeed=1000
hw.vendor=Sun
hw.product=Fire
hw.physmem=17171480576
hw.usermem=17171464192
slash:~#
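
For reference, here is a rough sketch of how the console uptime loop mentioned above could be extended so that each sample is timestamped and also written to disk, which would at least narrow down when a hang begins. It is only an illustration: the LOGFILE and INTERVAL names, the log path, and the "run" argument are made up for this example, and whether a sync before each sleep actually gets the last sample onto disk ahead of a hard lock is an assumption, not something that has been verified on this box.

```shell
#!/bin/sh
# Heartbeat logger: an illustrative sketch, not a tested diagnostic tool.
# LOGFILE and INTERVAL are hypothetical names chosen for this example.
LOGFILE="${LOGFILE:-/var/log/heartbeat.log}"
INTERVAL="${INTERVAL:-5}"

# Print one timestamped uptime sample to the console and append it to the log.
log_sample() {
    stamp=$(date '+%Y-%m-%d %H:%M:%S')
    line="$stamp $(uptime)"
    echo "$line"
    echo "$line" >> "$LOGFILE"
    sync    # ask the kernel to flush, so the sample has a chance to survive a hang
}

# Loop only when invoked with "run"; otherwise the file can simply be sourced.
if [ "${1:-}" = "run" ]; then
    while true; do
        log_sample
        sleep "$INTERVAL"
    done
fi
```

Started with the (hypothetical) argument "run" under screen on the serial console, the last line in the log after a reboot gives the approximate time of the hang, to within one interval.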