Edwin Whitelaw wrote:
I use WRAP single board computers running Voyage Linux as wireless
routers for my business. A system that had been running for over a week
started locking up approximately once a day after enabling the DHCP
functionality of dnsmasq. When the lockups occurred, I was unable to
gain access through either radio (one in managed mode 5GHz, the other a
2.4GHz AP) or the ethernet port. I did not have access to the console
port so can't comment on what might have been displayed there. DHCP
and everything else would work just fine up until the system froze. I
did not have remote logging enabled at the time but had three customers
so needed to get the problem in hand ASAP. I disabled the DHCP portion
of dnsmasq and loaded the ISC DHCP server to replace it.
The evidence is anecdotal but I have not had any problems so far for
over 60 hours, considerably longer than any previous uptime while using
dnsmasq's DHCP. I am still using the DNS portion of dnsmasq with no
problems.
This post is not really a request for help as I will probably keep the
current setup but it might help someone else track down a problem. On
the other hand, it would be nice to have the dnsmasq DHCP functionality
as an option going forward.
I'll provide more detailed system info if needed.
---
riner:~# uname -a
Linux riner 2.6.14-486-voyage #1 PREEMPT Wed Nov 2 18:14:20 GMT 2005 i586 GNU/Linux
---
I've not seen any other reports that could shed light on this problem.
How hard is the box locking up? If it's really not responding at all,
then that must be a hardware problem or kernel bug. (Though it might
well be triggered by dnsmasq.)
Just a thought: are you relying on this box for the DNS service to the
machine you are logging in _from_? Maybe the kernel is still up, but
dnsmasq crashed - hence no DNS. Also, is the suspect box relying on its
own dnsmasq for DNS (ie nameserver 127.0.0.1 in /etc/resolv.conf)?
Failure to do reverse lookups can wedge sshd for some time and cause
strange effects.
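
A rough way to check both points on the suspect box, as a sketch only:
the helper names here are made up for illustration, and the address in
the usage lines is a placeholder for whatever client you ssh in from.
The first function reports whether /etc/resolv.conf points at a local
resolver; the second times a reverse lookup, since a long delay there
is the sort of thing that could stall sshd.

```python
import socket
import time


def local_nameservers(resolv_conf_text):
    """Return the nameserver entries that point at the loopback address."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Comment lines start with '#' or ';', so their first field
        # is never the bare word "nameserver".
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return [s for s in servers if s.startswith("127.") or s == "::1"]


def time_reverse_lookup(ip):
    """Return (hostname or None, seconds taken) for a reverse lookup of ip."""
    start = time.monotonic()
    try:
        name = socket.gethostbyaddr(ip)[0]
    except OSError:
        # socket.herror, socket.gaierror and timeouts are all OSError.
        name = None
    return name, time.monotonic() - start


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        print("loopback resolvers:", local_nameservers(handle.read()))
    # 192.168.1.10 is a placeholder client address, not from the report.
    print(time_reverse_lookup("192.168.1.10"))
```

If the list is non-empty and the lookup takes many seconds while
dnsmasq is stopped, that would be consistent with sshd merely looking
wedged rather than the box being down.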
On the other hand, if even ping to the dotted-quad IP address fails,
then the box really has crashed. dnsmasq uses very different syscalls than
ISC dhcpd to do low-level network access, so kernel problems with
those are the obvious first line of attack. One thing you could do is
re-compile without RTnetlink support: that would eliminate one difference.
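
To tell the two cases apart from another machine the next time it
happens, something along these lines might help. It is only a sketch:
the function names are invented, the ping flags assume the Linux
iputils ping, and 192.168.1.1 and example.com are placeholders. It
pings the router by dotted-quad address, then sends one hand-built DNS
query straight to it, so the result does not depend on the client's
own resolver.

```python
import socket
import struct
import subprocess


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for the A record of name."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [p.encode("ascii") for p in name.split(".") if p]
    qname = b"".join(bytes([len(p)]) + p for p in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # type A, class IN


def ping_ok(ip, timeout_s=2):
    """Return True if a single ping to a dotted-quad address is answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def dns_ok(server_ip, name="example.com", timeout_s=2):
    """Return True if server_ip sends back any reply to our query."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        try:
            sock.sendto(query, (server_ip, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:
            return False
    # A genuine reply echoes the query id in its first two bytes.
    return reply[:2] == query[:2]


def classify(ping_answered, dns_answered):
    """Map the two probe results onto the cases discussed above."""
    if not ping_answered:
        return "box unreachable: looks like a real crash"
    if not dns_answered:
        return "box up, DNS down: dnsmasq may have died"
    return "box and DNS both responding"


if __name__ == "__main__":
    router = "192.168.1.1"  # placeholder address for the WRAP box
    print(classify(ping_ok(router), dns_ok(router)))
```

A failed ping over a wireless link can of course also mean the radio
link is down, so it is worth trying from the ethernet side as well.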
Cheers,
Simon.