OK - it did it again. named locked up, and its wait channel was select. This time I was able to kill the process and restart it, but I still could not do any queries. I added a quick ipfw add 1 allow ip from any to any and that solved the query problem. I then proceeded to inspect my ipfw rules. All outbound DNS queries use the following rule:
allow udp from any to any dst-port 53 keep-state.
I then tried to utilize some of my other keep-state rules with no luck. It would seem as if the firewall stack simply doesn't want to do stateful filtering after a while. I also tried flushing all the rules and reloading them - that still did not work. I can live for today without stateful rules, so if anyone can help me troubleshoot this today/tonight, that would be great. I don't want to reboot the machine until somebody can help me diagnose the problem - especially since I'm running what is going to be 6.2-RELEASE.

Looking back at the mailing list, I see that there was a change to ipfw.c that deals with dynamic rule timeout. Perhaps this is to blame?
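
For anyone trying to reproduce this, here is a minimal sketch of how the dynamic (keep-state) rule table could be inspected before rebooting. It assumes the stock ipfw sysctls documented in ipfw(8); the function names dyn_table_full and dyn_report are made up for illustration. A full table is only one possible explanation for keep-state rules no longer matching, not a confirmed diagnosis.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical, sysctl names are from ipfw(8).

# dyn_table_full COUNT MAX
# Succeeds (exit 0) when the number of dynamic rules has reached the limit,
# in which case no new keep-state entries can be installed until old ones expire.
dyn_table_full() {
    [ "$1" -ge "$2" ]
}

# dyn_report
# Print dynamic rule usage, the lifetimes in effect, and the rules themselves.
# Run as root on the firewall.
dyn_report() {
    count=$(sysctl -n net.inet.ip.fw.dyn_count)
    max=$(sysctl -n net.inet.ip.fw.dyn_max)
    echo "dynamic rules: $count of $max"
    if dyn_table_full "$count" "$max"; then
        echo "dynamic rule table is full"
    fi

    # Lifetimes (seconds) applied to dynamic rules.
    sysctl net.inet.ip.fw.dyn_udp_lifetime \
           net.inet.ip.fw.dyn_short_lifetime \
           net.inet.ip.fw.dyn_ack_lifetime

    # Static rules plus dynamic ones (-d), including expired entries (-e).
    ipfw -d -e show
}
```

Calling dyn_report once while keep-state is failing and once after a fresh boot would give two snapshots to compare.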

I am willing to give ssh access to debug this problem.

-Jon

Robert Watson wrote:


On Fri, 13 Oct 2006, Jonathan Feally wrote:

I have a P4 2.8 box running on an Intel MB with an em0, acting as a firewall. The em0 has multiple tagged VLANs on it, with no IP assigned to the main interface. Almost like clockwork now, 6-7 days after bootup, named or dhcpd completely locks up. I can't even kill -9 the apps. I have recompiled both apps since upgrading. I have only made two changes to this system, around the same time: 1. Removed a 2nd em NIC that had only 1 network connected, not VLAN tagged. 2. Upgraded to 6.2-PRE.

Has anyone else had these problems? I am going to try running the system with the internet connection not tagged to see if that helps.


I've not seen this on any boxes.  The usual debugging path here is to:

(1) Look at the process wait channel in ps axl.

(2) Compile KDB/DDB into the kernel, and do a kernel stack trace of the
    process.

Once you know what the kernel thread associated with the process is doing, we can attempt to figure out why it's doing it.
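
The two steps above could be sketched roughly as follows. This is an illustration under assumptions, not a prescribed procedure: the helper names wchan_filter and wchan_of are hypothetical, and the DDB steps in the comments assume a kernel rebuilt with the KDB and DDB options and access to the system console.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical.

# wchan_filter NAME
# Read "PID WCHAN STATE COMMAND" lines on stdin and print the first three
# fields of every line whose command is exactly NAME.
wchan_filter() {
    awk -v name="$1" '$4 == name { print $1, $2, $3 }'
}

# wchan_of NAME
# Step (1): show the wait channel of each running process called NAME.
# This is a narrower view of what "ps axl" prints in its wait channel column.
wchan_of() {
    ps -ax -o pid= -o wchan= -o state= -o comm= | wchan_filter "$1"
}

# Step (2), as comments because it needs a custom kernel and the console:
#   Kernel config:   options KDB
#                    options DDB
#   Enter debugger:  sysctl debug.kdb.enter=1
#   At the db> prompt (assuming trace accepts a pid, as it is commonly used):
#     ps             list processes and their wait channels
#     trace <pid>    kernel stack trace of the stuck process
#     c              continue normal operation
```

For example, wchan_of named would print one line per named process, such as the pid followed by select and the process state.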

Robert N M Watson
Computer Laboratory
University of Cambridge




_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
