Hi there,

I've been beating my head against a mysterious problem for some time now, and I'm hoping that you folks can help me out. When I run Amanda, seven of my 16 hosts don't respond. Of these, some are Solaris and some are FreeBSD 3.1-RELEASE, but it's the FreeBSD ones I'm concerned with at the moment. I'm using Amanda 2.4.1. (Note that the symptoms on the Solaris machines are different, which is why I'm posting this to -hackers.)

From my experiments with amcheck and snoop, it looks like the amandad on the affected clients is doing OK for one connection, but then the amandad process sticks around as a zombie and apparently inetd won't spawn a new one until the old one dies. The only way to get rid of the zombie that I've been able to find (besides rebooting of course) is to kill inetd.

(on my backup host)

    bash-2.02$ amcheck -c general

(on a malfunctioning client)

    cache2# inetd -d
    ADD : amanda proto=udp accept=0 max=1 user=backup group=(null) class=daemon builtin=0x0 server=/usr/local/amanda/libexec/amandad
    inetd: enabling amanda, fd 4
    inetd: registered /usr/local/amanda/libexec/amandad on 4
    inetd: someone wants amanda
    inetd: inetd: disabling amanda, fd 4+ closing from 4
    inetd: 6692 execl /usr/local/amanda/libexec/amandad

Looks fine. But that's with a freshly started inetd -- the second time I run amcheck, I get no output from inetd.

Now let's look at the truss output of inetd while I do an amcheck. This is with a fresh inetd.

    cache2# truss -p `cat /var/run/inetd.pid`
    syscall (null)() returns 1 (0x1)
    syscall sigprocmask(0x1,0x82001) returns 0 (0x0)
    syscall gettimeofday(0x80580ac,0x0) returns 0 (0x0)
    syscall fork() returns 6717 (0x1a3d)
    syscall sigprocmask(0x3,0x0) returns 532481 (0x82001)
    syscall sigprocmask(0x1,0x82001) returns 0 (0x0)
    SIGNAL 20
    SIGNAL 20
    SIGNAL 20
    syscall sigsuspend(0x0) errno 4 'Interrupted system call'
    syscall write(5,0xefbfda3b,1) returns 1 (0x1)
    syscall sigreturn(0xefbfda64) errno 4 'Interrupted system call'

And here it stays ... while I do another amcheck, even. It looks like SIGCHLD (signal 20) is being delivered, but I don't see any wait-type syscalls.

Any thoughts, anyone?

--
Ben
UNIX Systems Engineer, Skunk Group
StarMedia Network, Inc.