LORs (was Re: ghosthunting: machine freeze 6.2R)

Volker Fri, 25 May 2007 08:25:49 -0700

On 05/25/07 13:45, Volker wrote:

Using a debug kernel, the machine came up quickly with this LOR after the reboot:
lock order reversal:
 1st 0xc077078c tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:625
 2nd 0xc4f18180 pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c072fcd0,c072e1c8,c06f6124,...) at kdb_backtrace+0x29
witness_checkorder(c4f18180,9,c4f1536e,18f2) at witness_checkorder+0x578
_mtx_lock_flags(c4f18180,0,c4f1536e,18f2,c4f18180,...) at _mtx_lock_flags+0x78
pf_test(2,c4bdec00,e35c5ac4,0,0,...) at pf_test+0x81
pf_check_out(0,e35c5ac4,c4bdec00,2,0) at pf_check_out+0x3d
pfil_run_hooks(c0770340,e35c5b40,c4bdec00,2,0,...) at pfil_run_hooks+0xc9
ip_output(c50c8200,0,e35c5b0c,0,0,0) at ip_output+0x83a
tcp_respond(0,c4f85810,c4f85824,c50c8200,0,7a481ad6,4) at tcp_respond+0x3e1
tcp_input(c50c8200,14,1,93d306d9,0,...) at tcp_input+0x3124
ip_input(c50c8200) at ip_input+0x785
netisr_processqueue(c076dfd8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c4afca78,c4b4b180) at ithread_execute_handlers+0xe6
ithread_loop(c4adb990,e35c5d38,c4adb990,c0505918,0,...) at ithread_loop+0x67
fork_exit(c0505918,c4adb990,e35c5d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe35c5d6c, ebp = 0 ---
Expensive timeout(9) function: 0xc0528fb4(0) 0.002565972 s


This first one appeared at 13:22 (shortly after bootup).

OK, the next two LORs (similar to the first):

At 13:28 this one showed up in the logs:

lock order reversal:
 1st 0xc077078c tcp (tcp) @ /usr/src/sys/netinet/tcp_input.c:625
 2nd 0xc4f18180 pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c072fcd0,c072e1c8,c06f6124,...) at kdb_backtrace+0x29
witness_checkorder(c4f18180,9,c4f1536e,18f2) at witness_checkorder+0x578
_mtx_lock_flags(c4f18180,0,c4f1536e,18f2,c4f18180,...) at _mtx_lock_flags+0x78
pf_test(2,c4bdec00,e35c5ac4,0,0,...) at pf_test+0x81
pf_check_out(0,e35c5ac4,c4bdec00,2,0) at pf_check_out+0x3d
pfil_run_hooks(c0770340,e35c5b40,c4bdec00,2,0,...) at pfil_run_hooks+0xc9
ip_output(c50c8200,0,e35c5b0c,0,0,0) at ip_output+0x83a
tcp_respond(0,c4f85810,c4f85824,c50c8200,0,7a481ad6,4) at tcp_respond+0x3e1
tcp_input(c50c8200,14,1,93d306d9,0,...) at tcp_input+0x3124
ip_input(c50c8200) at ip_input+0x785
netisr_processqueue(c076dfd8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c4afca78,c4b4b180) at ithread_execute_handlers+0xe6
ithread_loop(c4adb990,e35c5d38,c4adb990,c0505918,0,...) at ithread_loop+0x67
fork_exit(c0505918,c4adb990,e35c5d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe35c5d6c, ebp = 0 ---
Expensive timeout(9) function: 0xc0528fb4(0) 0.002565972 s

At 16:55 I caught this message:

kernel: acpi: suspend request ignored (not ready yet)

A minute (or seconds?) later the machine died, and I did not get anything into the logs around that time. What's the reason for this ACPI message?

After bootup (reset key pressed by an operator), the machine produced this LOR:


lock order reversal:
 1st 0xc4f68180 pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:6386
 2nd 0xc077078c tcp (tcp) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf.c:2744
KDB: stack backtrace:
kdb_backtrace(0,ffffffff,c072e1c8,c072fcd0,c06f6124,...) at kdb_backtrace+0x29
witness_checkorder(c077078c,9,c4f6536e,ab8) at witness_checkorder+0x578
_mtx_lock_flags(c077078c,0,c4f6536e,ab8,c077078c,...) at _mtx_lock_flags+0x78
pf_socket_lookup(e35c5b00,e35c5b04,1,e35c5bc0,0,...) at pf_socket_lookup+0x1d3
pf_test_tcp(e35c5b70,e35c5b68,1,c4ee0e00,c4d6f400,...) at pf_test_tcp+0x11e6
pf_test(1,c4c11c00,e35c5c5c,0,0,...) at pf_test+0xb8b
pf_check_in(0,e35c5c5c,c4c11c00,1,0) at pf_check_in+0x37
pfil_run_hooks(c0770340,e35c5cb4,c4c11c00,1,0) at pfil_run_hooks+0xc9
ip_input(c4d6f400) at ip_input+0x272
netisr_processqueue(c076dfd8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c4afca78,c4b4b180) at ithread_execute_handlers+0xe6
ithread_loop(c4adb990,e35c5d38,c4adb990,c0505918,0,...) at ithread_loop+0x67
fork_exit(c0505918,c4adb990,e35c5d38) at fork_exit+0xa0
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe35c5d6c, ebp = 0 ---
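
Putting the reports side by side: the earlier ones take tcp first and then pf task mtx (outbound, via tcp_respond and ip_output), while this last one takes pf task mtx first and then tcp (inbound, via pf_test_tcp and pf_socket_lookup). A toy sketch of that comparison follows. It only illustrates what "reversal" means for these two lock names and is not how WITNESS itself is implemented; `observed_orders` and `find_reversals` are names made up for the illustration.

```python
# Lock acquisition orders as reported in the LOR messages above.
# Each tuple is (lock already held, lock acquired next).
observed_orders = [
    ("tcp", "pf task mtx"),   # tcp_input -> tcp_respond -> ip_output -> pf_test
    ("pf task mtx", "tcp"),   # pf_test -> pf_test_tcp -> pf_socket_lookup
]


def find_reversals(orders):
    """Return the lock pairs that were seen acquired in both orders."""
    seen = set()
    reversals = set()
    for first, second in orders:
        # A reversal exists if the opposite order was recorded earlier.
        if (second, first) in seen:
            reversals.add(frozenset((first, second)))
        seen.add((first, second))
    return reversals


for pair in find_reversals(observed_orders):
    print("reversal between:", " and ".join(sorted(pair)))
```

Two code paths that take the same pair of mutexes in opposite orders can deadlock if they run concurrently, which is why the kernel reports the pair even when nothing has hung yet.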

My assumption: the LORs are somewhat pf related but are not related to the lockup of the system. Am I correct? What might be the reason for that ACPI message, and may ACPI be a cause of the lockup? What might be a possible cause for WITNESS and INVARIANTS being unable to catch whatever causes the freeze?


Thx

Volker

PS: Sorry for flooding this list. Should I direct postings to [EMAIL PROTECTED]?
PPS: Is anybody able to provide me patches for these LORs?
_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
