Stefan Lambrev wrote:
Hi,
Can you replace all calls to rtfree() with RTFREE_LOCKED() in these files:
netinet/if_ether.c
netinet6/nd6_nbr.c
netinet6/in6_ifattach.c
netinet6/in6_gif.c
Of course, do not forget net/route.c with the patch from the PR.
Then recompile the kernel and check whether this cures your hangs.
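
To illustrate the "release while holding the lock" pattern this replacement relies on, here is a small userland sketch. It is not the kernel macro: struct fake_rtentry and both helpers are made-up names, and a pthread mutex stands in for the rtentry mutex. The assumption is that RTFREE_LOCKED() expects the caller to hold the route lock and gives it up together with the reference.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rtentry: a mutex plus a reference count. */
struct fake_rtentry {
	pthread_mutex_t	mtx;
	int		refcnt;
};

/* Allocate an entry and return it locked, with one reference held. */
static struct fake_rtentry *
fake_rt_alloc_locked(void)
{
	struct fake_rtentry *rt = malloc(sizeof(*rt));

	if (rt == NULL)
		return (NULL);
	if (pthread_mutex_init(&rt->mtx, NULL) != 0) {
		free(rt);
		return (NULL);
	}
	rt->refcnt = 1;
	pthread_mutex_lock(&rt->mtx);
	return (rt);
}

/*
 * Drop one reference.  The caller must hold rt->mtx; the lock is always
 * released here.  Returns 1 if the entry was freed, 0 otherwise.
 * The caller's pointer is cleared to guard against reuse.
 */
static int
fake_rt_release_locked(struct fake_rtentry **rtp)
{
	struct fake_rtentry *rt = *rtp;
	int freed = 0;

	assert(rt != NULL && rt->refcnt > 0);
	*rtp = NULL;
	if (--rt->refcnt == 0) {
		/* Last reference: unlock before destroying the mutex. */
		pthread_mutex_unlock(&rt->mtx);
		pthread_mutex_destroy(&rt->mtx);
		free(rt);
		freed = 1;
	} else {
		pthread_mutex_unlock(&rt->mtx);
	}
	return (freed);
}
```

The only point of the sketch is that after the release call the caller touches neither the entry nor its lock; mixing a locked and an unlocked release path on the same entry is the kind of mismatch that could produce hangs.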
I'm not sure about the lock order reversal; maybe it was introduced
by kdb_backtrace().
You can remove it from route.c, replace rtfree() and build the kernel with
debugging enabled, to see if the LOR is gone.
It seems that the panic is caused by rtalloc1(), called in route.c at
line 333:
rt = rtalloc1(dst, 0, 0UL); /* NB: rt is locked */
most probably because rt is not locked :)
I'm out of ideas how to check if it is really locked, but you can
experiment with RT_LOCK() and RT_UNLOCK().
Maybe mtx_trylock() can help too.
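
As a rough illustration of the trylock idea, here is a userland sketch that probes whether a mutex is currently held. Everything in it is an assumption for demonstration: struct fake_rtentry and fake_rt_is_locked() are made-up names, and pthread_mutex_trylock() stands in for mtx_trylock(). In the kernel itself, mtx_assert(9) with MA_OWNED may be the more direct check when INVARIANTS is enabled.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct rtentry: a mutex plus a reference count. */
struct fake_rtentry {
	pthread_mutex_t	mtx;
	int		refcnt;
};

/*
 * Probe whether the entry's mutex is held by anyone.
 * Returns 1 if it is held, 0 if it was free.  The answer can be stale
 * as soon as it is returned, so this is a debugging aid only.
 */
static int
fake_rt_is_locked(struct fake_rtentry *rt)
{
	int error;

	assert(rt != NULL);
	error = pthread_mutex_trylock(&rt->mtx);
	if (error == 0) {
		/* Nobody held it; we just took it, so give it back. */
		pthread_mutex_unlock(&rt->mtx);
		return (0);
	}
	return (error == EBUSY);
}
```

Note that a probe like this only says that somebody holds the lock, not that the current thread does, which is why an ownership assertion is probably the better tool for the rtalloc1() question.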
Please share your findings with -net & -current if you have not already.
=cut=
Unfortunately, I ran out of time before I could complete the test.
However, I can report one more interesting finding from today: the ICMP
packets that trigger the bug probably come either from a Cisco router
or from the setup itself.
Late today our network topology was changed.
Previous setup:
affected hosts ISP's router (default gw)
.1
LAN ------------ router-------- wlan 1 (via ISP)
| 192.168.3.0
our firewall .254 |
fw ----------wlan 2
| 172.16.2.0 (isakmpd)
|
Internet
Current setup:
affected hosts our fw (OpenBSD)
.1 192.168.3.0
LAN ------------ router------ wlan 1 (isakmpd)
|
| 172.16.2.0 (isakmpd)
| --------wlan 2
|
|
Internet
and this "fixed" the problem!
We have no access to the Cisco, so I don't know its configuration. But:
No lockups, no "rtfree" messages.
If the bug is still unresolved in mid-January, I can continue testing
then. Thanks to all for your suggestions and help!
--per
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[EMAIL PROTECTED]"