On Thu, Jun 21, 2007 at 03:02:52PM -0700, Ted Unangst wrote:
> On 6/21/07, andrew fresh <[EMAIL PROTECTED]> wrote:
> >I have several routers that have been running great for many months.
> >(even better since I upgraded to 4.1 on them around May 4th)
> >
> >OpenBSD 4.1-stable (GENERIC.MP) #0: Fri May 4 21:56:51 MST 2007
> >
> >This morning, one of them went down and nagios paged me. Getting to
> >work, I just thought it was odd, looked at the trace and restarted it
> >and went home. About half an hour later, it happened again. I again
>
> what happens if you push c and enter?

I finally got to find out: the router dropped into DDB again. Continuing
isn't all that useful.

ddb{1}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al

I got that same thing all 10 or so times I tried it.

Below is the log. The first bit is from the first time it dropped into
DDB; I didn't get the full trace that time, but within about 5 minutes it
did it again and I did get the trace, ps and even a show registers.

It has been OK again for about an hour, but if there is something else
that would provide more information, please let me know and I can try
that if it happens again.

l8rZ,
--
andrew - ICQ# 253198 - Jabber: [EMAIL PROTECTED]

"When the grammar checker identifies an error, it suggests a
correction and can even makes some changes for you."
 - Microsoft Word for Windows 2.0 User's Guide.

=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2007.07.23 13:10:58 =~=~=~=~=~=~=~=~=~=~=~=
ddb{0}>
ddb{0}> sh panic
the kernel did not panic
ddb{0}> trace
db_read_bytes(0,1,e7f2fd5c,2,0) at db_read_bytes+0x14
db_get_value(0,1,0,d067dbc3,0) at db_get_value+0x19
db_disasm(0,0,d033f310,0,50) at db_disasm+0x1d
db_print_loc_and_inst(0,e7f2fe14,e7f2fe2c,d0473534,0) at db_print_loc_and_inst+0x2d
db_trap(6,0,e7f2fe4c,d04642dd,1) at db_trap+0x75
kdb_trap(6,0,e7f2fe94,50) at kdb_trap+0xe8
trap() at trap+0xa1
--- trap (number 6) ---
(null)(0,d1229240,0,e7f2e000,0) at 0
softclock(e7f20058,e7f20010,10,e7f20010,e7f2e000) at softclock+0x22c
Bad frame pointer: 0xe7f2ff20
ddb{0}> boot sync
syncing disks... panic: tsleep: not SONPROC
Stopped at Debugger+0x4: leave
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb{0}> boot sync
rebooting...
>> OpenBSD/i386 BOOT 2.10
boot>
booting hd0a:/bsd: 5611032+882424 [52+286400+266500]=0x6b867c
entry point at 0x200120*

[ using 553324 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2007 OpenBSD. All rights reserved. http://www.OpenBSD.org

OpenBSD 4.1-stable (GENERIC.MP) #0: Fri May 4 21:56:51 MST 2007
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel Pentium III ("GenuineIntel" 686-class) 732 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
real mem = 536436736 (523864K)
avail mem = 481710080 (470420K)
using 4278 buffers containing 26943488 bytes (26312K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+ BIOS, date 08/04/03, BIOS32 rev. 0 @ 0xffe90, SMBIOS rev. 2.3 @ 0xfafc0 (51 entries)
bios0: Dell Computer Corporation PowerEdge 2450
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfc2c0/144 (7 entries)
pcibios0: PCI Interrupt Router at 000:15:0 ("ServerWorks OSB4" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0x8000 0xc8000/0x6000 0xec000/0x4000!
acpi at mainbus0 not configured
mainbus0: Intel MP Specification (Version 1.4)
cpu0 at mainbus0: apid 1 (boot processor)
cpu0: apic clock running at 132 MHz
cpu1 at mainbus0: apid 0 (application processor)
cpu1: Intel Pentium III ("GenuineIntel" 686-class) 732 MHz
cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
mainbus0: bus 0 is type PCI
mainbus0: bus 1 is type PCI
mainbus0: bus 2 is type PCI
mainbus0: bus 3 is type PCI
mainbus0: bus 4 is type ISA
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 11, 16 pins
ioapic0: misconfigured as apic 0, remapped to apid 2
ioapic1 at mainbus0: apid 3 pa 0xfec01000, version 11, 16 pins
ioapic1: misconfigured as apic 0, remapped to apid 3
esm0 at mainbus0
esm0: PowerEdge 2450 Embedded Server Management 5.24
esm0: Primary System Backplane 1.16
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "ServerWorks CNB20LE Host" rev 0x05
pchb1 at pci0 dev 0 function 1 "ServerWorks CNB20LE Host" rev 0x05
pci1 at pchb1 bus 2
ppb0 at pci1 dev 2 function 0 "Intel i960 RM PCI-PCI" rev 0x01
pci2 at ppb0 bus 3
ahc0 at pci2 dev 4 function 0 "Adaptec AIC-7899 U160" rev 0x01: apic 3 int 15 (irq 10)
scsibus0 at ahc0: 16 targets
sd0 at scsibus0 targ 0 lun 0: <QUANTUM, ATLAS 10K 9SCA, UCIE> SCSI3 0/direct fixed
sd0: 8683MB, 10042 cyl, 6 head, 295 sec, 512 bytes/sec, 17783249 sec total
sd1 at scsibus0 targ 1 lun 0: <QUANTUM, ATLAS 10K 9SCA, UCIE> SCSI3 0/direct fixed
sd1: 8683MB, 10042 cyl, 6 head, 295 sec, 512 bytes/sec, 17783249 sec total
safte0 at scsibus0 targ 6 lun 0: <DELL, 1x4 U2W SCSI BP, 1.16> SCSI2 3/processor fixed
ahc1 at pci2 dev 4 function 1 "Adaptec AIC-7899 U160" rev 0x01: apic 3 int 14 (irq 11)
scsibus1 at ahc1: 16 targets
fxp0 at pci1 dev 8 function 0 "Intel 8255x" rev 0x08, i82559: apic 3 int 0 (irq 11), address 00:b0:d0:20:8a:b1
inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
ppb1 at pci0 dev 2 function 0 "DEC 21152 PCI-PCI" rev 0x03
pci3 at ppb1 bus 1
fxp1 at pci3 dev 4 function 0 "Intel 8255x" rev 0x05, i82558: apic 3 int 4 (irq 11), address 00:50:8b:5e:e7:ac
inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 0
fxp2 at pci3 dev 5 function 0 "Intel 8255x" rev 0x05, i82558: apic 3 int 5 (irq 10), address 00:50:8b:5e:e7:ad
inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 0
san0 at pci0 dev 4 function 0 "Sangoma A10x" rev 0x01 apic 3 int 1 (irq 11)
san1 at pci0 dev 8 function 0 "Sangoma A10x" rev 0x01 apic 3 int 6 (irq 10)
vga1 at pci0 dev 14 function 0 "ATI Mach64 GY" rev 0x7a
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
piixpm0 at pci0 dev 15 function 0 "ServerWorks OSB4" rev 0x4f: SMI
iic0 at piixpm0
pciide0 at pci0 dev 15 function 1 "ServerWorks OSB4 IDE" rev 0x00: DMA
atapiscsi0 at pciide0 channel 0 drive 0
scsibus2 at atapiscsi0: 2 targets
cd0 at scsibus2 targ 0 lun 0: <SAMSUNG, CD-ROM SN-124, S003> SCSI0 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2
ohci0 at pci0 dev 15 function 2 "ServerWorks OSB4/CSB5 USB" rev 0x04: apic 2 int 5 (irq 5), version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
uhub0 at usb0
uhub0: ServerWorks OHCI root hub, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
isa0 at mainbus0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
pctr: 686-class user-level performance counters enabled
mtrr: Pentium Pro MTRR support
ahc0: target 0 using 16bit transfers
ahc0: target 0 synchronous at 80.0MHz DT, offset = 0x1f
dkcsum: sd0 matches BIOS drive 0x80
ahc0: target 1 using 16bit transfers
ahc0: target 1 synchronous at 80.0MHz DT, offset = 0x1f
dkcsum: sd1 matches BIOS drive 0x81
root on sd0a
rootdev=0x400 rrootdev=0xd00 rawdev=0xd02
WARNING: / was not properly unmounted
Automatic boot in progress: starting file system checks.
/dev/rsd0a: 1780 files, 21087 used, 43258 free (362 frags, 5362 blocks, 0.6% fragmentation)
/dev/rsd0a: MARKING FILE SYSTEM CLEAN
/dev/rsd0h: 995 files, 32540 used, 31821 free (69 frags, 3969 blocks, 0.1% fragmentation)
/dev/rsd0h: MARKING FILE SYSTEM CLEAN
/dev/rsd0d: 54 files, 93 used, 128638 free (126 frags, 16064 blocks, 0.1% fragmentation)
/dev/rsd0d: MARKING FILE SYSTEM CLEAN
/dev/rsd0g: 8845 files, 90879 used, 166144 free (2256 frags, 20486 blocks, 0.9% fragmentation)
/dev/rsd0g: MARKING FILE SYSTEM CLEAN
/dev/rsd0e: 728 files, 3360 used, 61001 free (25 frags, 7622 blocks, 0.0% fragmentation)
/dev/rsd0e: MARKING FILE SYSTEM CLEAN
/dev/rsd0f: 68 files, 3585 used, 125146 free (130 frags, 15627 blocks, 0.1% fragmentation)
/dev/rsd0f: MARKING FILE SYSTEM CLEAN
setting tty flags
pf enabled
net.inet.ip.forwarding: 0 -> 1
starting network
add host 207.173.230.131: gateway 66.185.224.3
add net 216.190.36.132/30: gateway 66.185.224.3
add net 216.190.36.136/30: gateway 66.185.224.3san0: Bringing interface up.
san0: Configuring A101 PMC T1/E1/J1 Front End
san0: Link connecting...
san0: Bringing interface up.
add net 216.190.36.140/30: gateway 66.185.224.3
add net 216.190.36.144/30: gateway 66.185.224.3
add net 66.185.224.0/20: gatseway 66.185.224.a1
add net 10.0.n0.0/8: gateway 606.185.224.1
add: net 192.168.0.0 /16: gateway 66.U185.224.1
nadd host 144.228k.242.172: gatewany 144.228.193.61o
wn signal (00).
san0: T1 connected!
san0: Link connected!
san0: T1 YELLOW ON
san0: T1 disconnected!
san0: Link connecting...
san1: Bringing interface up.
san1: Configuring A101 PMC T1/E1/J1 Front End
san1: Link connecting...
san1: Bringing interface up.
san1: Unknown signal (00).
san1: T1 connected!
san1: Link connected!
san1: T1 YELLOW ON
san1: T1 disconnected!
san1: Link connecting...
san2: Bringing interface up.
san2: Configuring A101 PMC T1/E1/J1 Front End
san2: Link connecting...
san2: Bringing interface up.
add host 12.123.145.122: gateway 12.124.16.21
add net 135.89.154.144/28: gatewsay 12.124.16.21a
add net 135.89.n152.48/28: gatew2ay 12.124.16.21: Unknown signal (00).
san2: T1 connected!
san2: Link connected!
san2: T1 YELLOW ON
san2: T1 disconnected!
san2: Link connecting...
san3: Bringing interface up.
san3: Configuring A101 PMC T1/E1/J1 Front End
san3: Link connecting...
san3: Bringing interface up.
san3: Unknown signal (00).
san3: T1 connected!
san3: Link connected!
san3: T1 YELLOW ON
san3: T1 disconnected!
san3: Link connecting...
san0: T1 YELLOW OFF
san0: T1 connected!
san0: Link connected!
san3: T1 YELLOW OFF
san3: T1 connected!
san3: Link connected!
san3: T1 LB activation code received.
san3: Unknown signal (15).
starting system logger
starting initial daemons: ntpd.
savecore: no core dump
checking quotas: done.
building ps databases: kvm dev.
clearing /tmp
starting pre-securelevel daemons:.
setting kernel security level: kern.securelevel: 0 -> 1
creating runtime link editor directory cache.
preserving editor files
starting network daemons: bgpd sendmail inetd sshd.
starting local daemons: snmpd.
standard daemons: cron.
Mon Jul 23 13:11:23 MST 2007

OpenBSD/i386 (rrlhcrtr0200.lhc.redrivernet.com) (tty00)

login: san2: LCP keepalive timeout
kernel: page fault trap, code=0
Stopped at 0: kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> trace
db_read_bytes(0,1,e7f2fd5c,2,0) at db_read_bytes+0x14
db_get_value(0,1,0,d067dbc3,0) at db_get_value+0x19
db_disasm(0,0,d033f310,0,50) at db_disasm+0x1d
db_print_loc_and_inst(0,e7f2fe14,e7f2fe2c,d0473534,0) at db_print_loc_and_inst+0x2d
db_trap(6,0,e7f2fe4c,d04642dd,1) at db_trap+0x75
kdb_trap(6,0,e7f2fe94,50) at kdb_trap+0xe8
trap() at trap+0xa1
--- trap (number 6) ---
(null)(0,d1229240,0,e7f2e000,0) at 0
softclock(e7f20058,e7f20010,10,e7f20010,e7f2e000) at softclock+0x22c
Bad frame pointer: 0xe7f2ff20
ddb{0}> ps
  PID   PPID   PGRP   UID  S      FLAGS  WAIT       COMMAND
30740  24096  30740  1000  3  0x2004082  ttyin      ksh
24096   6922   6922  1000  3  0x2000180  select     sshd
 6922   3191   6922     0  3  0x2004080  netio      sshd
 5799      1   5799     0  3  0x2004082  ttyin      getty
19371      1  19371     0  3  0x2004082  ttyin      getty
 9538      1   9538     0  3  0x2004082  ttyin      getty
16680      1  16680     0  3  0x2004082  ttyin      getty
 7757      1   7757     0  3  0x2004082  ttyin      getty
19265      1  19265     0  3  0x2004082  ttyin      getty
13221      1  13221     0  3  0x2040180  select     sendmail
19942      1  19942     0  3  0x2000080  select     cron
31574      1  17848     0  3  0x2000080  select     snmpd
 3191      1   3191     0  3  0x2000080  select     sshd
29797      1  29797     0  3  0x2000180  select     inetd
 8083   2029   2029    75  3  0x2000180  poll       bgpd
14056   2029   2029    75  3  0x2000180  poll       bgpd
 2029      1   2029     0  2  0x2000080             bgpd
 7386   3123   3123    83  3  0x2000180  poll       ntpd
 3123      1   3123     0  3  0x2000080  poll       ntpd
29684  12345  12345    74  3  0x2000180  bpf        pflogd
12345      1  12345     0  3  0x2000080  netio      pflogd
26993  20475  20475    73  2  0x2000180             syslogd
20475      1  20475     0  3  0x2000088  netio      syslogd
   14      0      0     0  3  0x2100200  crypto_wa  crypto
   13      0      0     0  3  0x2100200  aiodoned   aiodoned
   12      0      0     0  3  0x2100200  syncer     update
   11      0      0     0  3  0x2100200  cleaner    cleaner
   10      0      0     0  3   0x100200  reaper     reaper
    9      0      0     0  3  0x2100200  pgdaemon   pagedaemon
    8      0      0     0  3  0x2100200  pftm       pfpurge
    7      0      0     0  3  0x2100200  wait       wskbd_hotkey
    6      0      0     0  3  0x2100200  usbtsk     usbtask
    5      0      0     0  3  0x2100200  usbevt     usb0
    4      0      0     0  3  0x2100200  timeout    sensors
    3      0      0     0  3  0x2100200  slacking   scsi
    2      0      0     0  3  0x2100200  kmalloc    kmthread
    1      0      1     0  3  0x2004080  wait       init
    0     -1      0     0  3  0x2080200  scheduler  swapper
ddb{0}> machine ddb 1
Stopped at Debugger+0x4: leave
ddb{1}> sh panic
the kernel did not panic
ddb{1}> trace
Debugger(d122cc00,0,0,d6c8ed14,0) at Debugger+0x4
i386_ipi_handler(b0,58,10,d0790010,e7f30010) at i386_ipi_handler+0x57
Xintripi() at Xintripi+0x47
--- interrupt ---
i386_softintlock(0,d0200058,e7f30010,10,e7f30010) at i386_softintlock+0x65
Xintrltimer() at Xintrltimer+0x47
--- interrupt ---
apm_cpu_idle(0,0,0,0,0) at apm_cpu_idle+0x4a
ddb{1}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> c
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}>
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> sh buf
  vp 0xc00000b8 lblkno 0x89cfc00000908b02 blkno 0x908902ca83f055 dev 0x73d0652f
  proc 0xf0 error 0 flags 35421343400<DONE,EINTR,NOCACHE,PHYS,DEFERRED,SCANNED>
  bufsize 0xe845c7 bcount 0x76000000 resid 0x6cfe8163 sync 0xcfffffff
  data 0xc1f0895b saveaddr 0xe0c10ce8 dep 0xd039ffff iodone 0x8dcfc000
  dirty {off 0x31548dcf end 0x81f089ff} valid {off 0xfff000e2 end 0xf00025ff}
ddb{0}> sh map
MAP 0xd04647e0: [0xe0c10ce8->0x908b02]
        #ent=-1073741680, sz=3965028815, ref=37003, version=1962052032, flags=0xd8220fd8
kernel: page fault trap, code=0
Faulted in DDB; continuing...
ddb{0}> sh page
PAGE 0xd04647e0:
  flags=105d<BUSY,TABLED,CLEAN,CLEANCHK,FAKE,PAGER1>, pqflags=0, vers=17863, wire_count=240, pa=0x0
  uobject=0xfe81ff31, uanon=0x8758b0c, offset=0x8b0c4d8bcfffffff loan_count=-330971392
  [page ownership tracking disabled]
ddb{0}> sh regi
ds          0x10
es          0x10
fs          0x58
gs          0x10
edi         0xd079e138  ddb_regs+0x58
esi         0x1
ebp         0xe7f2fd3c
ebx         0xe7f2fd5c
edx         0
ecx         0
eax         0x1
eip         0xd04647e0  db_read_bytes+0x14
cs          0x8
eflags      0x10202
esp         0xe7f2fd38
ss          0xe7f20010
db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> cont
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> continue
kernel: page fault trap, code=0
Stopped at db_read_bytes+0x14: movb 0(%edx),%al
ddb{0}> boot halt

The operating system has halted.
Please press any key to reboot.
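
For anyone comparing traces from repeated crashes, the frame of interest is the one just below the "--- trap (number 6) ---" line: "(null)" at address 0, reached from softclock+0x22c. A minimal Python sketch for pulling that out of a saved console log follows. The helper names parse_trace and find_null_caller are hypothetical, the sample frames are a subset copied from the trace in the log, and the regular expression is an assumption about the frame layout ddb printed here rather than a documented format.

```python
import re

# Subset of the frames from the ddb trace in the log (assumed layout:
# "func(args) at location", one frame per line).
TRACE = """\
db_read_bytes(0,1,e7f2fd5c,2,0) at db_read_bytes+0x14
db_get_value(0,1,0,d067dbc3,0) at db_get_value+0x19
db_disasm(0,0,d033f310,0,50) at db_disasm+0x1d
db_trap(6,0,e7f2fe4c,d04642dd,1) at db_trap+0x75
kdb_trap(6,0,e7f2fe94,50) at kdb_trap+0xe8
trap() at trap+0xa1
--- trap (number 6) ---
(null)(0,d1229240,0,e7f2e000,0) at 0
softclock(e7f20058,e7f20010,10,e7f20010,e7f2e000) at softclock+0x22c
"""

# Function name, argument list, then the "at <location>" suffix.
FRAME = re.compile(r"^(?P<func>\S+?)\((?P<args>[^)]*)\) at (?P<loc>\S+)$")


def parse_trace(text):
    """Return (function, location) pairs; non-frame lines are skipped."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append((match.group("func"), match.group("loc")))
    return frames


def find_null_caller(frames):
    """Return the location of the frame that called a "(null)" frame."""
    for index, (func, _loc) in enumerate(frames):
        if func == "(null)" and index + 1 < len(frames):
            return frames[index + 1][1]
    return None


print(find_null_caller(parse_trace(TRACE)))
```

On the sample this prints softclock+0x22c as the caller. That might point at a timeout being run with a null function pointer, but that reading is a guess from the trace, not a diagnosis.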