>>>>> "Brian" == Brian May <[EMAIL PROTECTED]> writes:

    Brian> That's not the way I grok the strace log:

    Brian> 4266 unlink("/var/run/amavis/amavisd.pid" <unfinished ...>
    Brian> 12868 <... rt_sigaction resumed> {SIG_DFL}, 8) = 0 20650
    Brian> rt_sigaction(SIGCHLD, {SIG_DFL}, <unfinished ...> 4266
    Brian> <... unlink resumed> ) = 0

    Brian> So it would appear that 4266 is deleting the PID file
    Brian> before it exits.

    Brian> However I am getting lost in a maze of PIDs, will continue
    Brian> investigating later.
    Brian> -- Brian May <[EMAIL PROTECTED]>

Here is a rough process tree in a format that I (tm) understand:


31119 execve("/etc/init.d/amavis", ["/etc/init.d/amavis", "start"], [/* 22 vars */]) = 0

----> 5148

31119 --- SIGCHLD (Child exited) @ 0 (0) ---

----> 28681 execve("/bin/chown", ["chown", "-c", "-h", "amavis:amavis", 
"/var/run/amavis"], [/* 20 vars */]) = 0

31119 --- SIGCHLD (Child exited) @ 0 (0) ---

----> 14384 execve("/bin/chmod", ["chmod", "-c", "755", "/var/run/amavis"], [/* 20 vars */]) = 0

31119 --- SIGCHLD (Child exited) @ 0 (0) ---


----> 5948 execve("/sbin/start-stop-daemon", ["start-stop-daemon", "--start", 
"--quiet", "--pidfile", "/var/run/amavis/amavisd.pid", "--name", "amavisd-new", 
"--startas", "/usr/sbin/amavisd-new", "--", "-P", 
"/var/run/amavis/amavisd.pid", "start"], [/* 20 vars */]) = 0

      ---> 27723 execve("/bin/sh", ["sh", "-c", "run-parts --list \"/usr/share/ama"...], [/* 20 vars */] <unfinished ...>

      5948  --- SIGCHLD (Child exited) @ 0 (0) ---

      ---> 27966 execve("/bin/sh", ["sh", "-c", "run-parts --list \"/etc/amavis/co"...], [/* 20 vars */] <unfinished ...>

      5948  --- SIGCHLD (Child exited) @ 0 (0) ---

      ---> 17788 execve("/usr/bin/head", ["head", "-n", "1", "/etc/mailname"], 
[/* 21 vars */] <unfinished ...>

      5948  --- SIGCHLD (Child exited) @ 0 (0) ---

      ---> 23935 execve("/bin/hostname", ["hostname", "--fqdn"], [/* 21 vars */] <unfinished ...>

      5948  --- SIGCHLD (Child exited) @ 0 (0) ---

      ---> 25027 continues

      5948 exit_group(0)

31119 --- SIGCHLD (Child exited) @ 0 (0) ---
31119 wait4(-1, 0x7bc972e62554, WNOHANG, NULL) = -1 ECHILD (No child processes)

           25027 continues

31119 exit_group(0)

           25027 continues

           ----> 4266 continues

           25027 continues

                 4266  --- SIGPIPE (Broken pipe) @ 0 (0) ---

                 ---> 15290 continues

                      15290 execve("/usr/bin/dccproc", ["/usr/bin/dccproc", 
"-H", "-x", "0"], [/* 23 vars */]) = 0

                 4266  --- SIGCHLD (Child exited) @ 0 (0) ---

                 ---> 20650 continues

                 ---> 12868 continues

           25027 --- SIGALRM (Alarm clock) @ 0 (0) ---

           25027 kill(4266, SIGTERM)               = 0

           25027 wait4(4266,  <unfinished ...>

                 4266  --- SIGTERM (Terminated) @ 0 (0) ---

                 4266 kill(12868, SIGTERM)              = 0

                 4266 kill(20650, SIGTERM <unfinished ...>

                      12868 --- SIGTERM (Terminated) @ 0 (0) ---

                      20650 --- SIGTERM (Terminated) @ 0 (0) ---
...

                      12868 exit_group(0)

                 4266  --- SIGCHLD (Child exited) @ 0 (0) ---

                 4266 unlink("/var/run/amavis/amavisd.pid" <unfinished ...>
                 
                      20650 exit_group(0)

                 4266 exit_group(0)

           25027 --- SIGCHLD (Child exited) @ 0 (0) ---

           ----> 12365

                 12365 --- SIGPIPE (Broken pipe) @ 0 (0) ---

                 -----> 15895

                 -----> 32317

           25027 --- SIGALRM (Alarm clock) @ 0 (0) ---

           25027 kill(12365, SIGTERM)              = 0
         
           25027 wait4(12365,  <unfinished ...>

                 12365 --- SIGTERM (Terminated) @ 0 (0) ---

                 12365 kill(32317, SIGTERM)              = 0

                 12365 kill(15895, SIGTERM <unfinished ...>

                        32317 --- SIGTERM (Terminated) @ 0 (0) ---

                        15895 --- SIGTERM (Terminated) @ 0 (0) ---

                 12365 exit_group(0)

           25027 --- SIGCHLD (Child exited) @ 0 (0) ---

           ----> 7431

                 7431  --- SIGPIPE (Broken pipe) @ 0 (0) ---

           ----> 2578

                 2578  --- SIGPIPE (Broken pipe) @ 0 (0) ---

                        32317 exit_group(0)                     = ?
                        15895 exit_group(0)                     = ?

           25027 --- SIGTERM (Terminated) @ 0 (0) ---

           25027 kill(7431, SIGTERM)               = 0

           25027 kill(2578, SIGTERM <unfinished ...>

                 7431  --- SIGTERM (Terminated) @ 0 (0) ---

                 2578  --- SIGTERM (Terminated) @ 0 (0) ---

           25027 wait4(-1,  <unfinished ...>

                 7431  exit_group(0)                     = ?

                 2578  exit_group(0)                     = ?
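
A tree like the one above could also be rebuilt mechanically rather than
by hand. The following is only a minimal sketch: it assumes the log was
written with something like "strace -f -o FILE", so that each line starts
with a PID and successful clone/fork/vfork calls end in "= <child pid>".
The function names and the sample lines are made up for illustration and
are not taken from the real amavis log.

```python
import re
from collections import defaultdict

# A completed fork-like call, either on one line or as a "resumed" line:
#   100 clone(child_stack=NULL, flags=...|SIGCHLD) = 101
#   100 <... clone resumed> ) = 102
FORK_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?:<\.\.\. )?(?:clone3?|fork|vfork)\b.*\)"
    r"\s*=\s*(?P<child>\d+)\s*$"
)


def build_tree(lines):
    """Return {parent_pid: [child_pid, ...]} from strace -f style lines."""
    children = defaultdict(list)
    for line in lines:
        m = FORK_RE.match(line.strip())
        if m:
            children[int(m.group("pid"))].append(int(m.group("child")))
    return dict(children)


def render(children, pid, depth=0):
    """Return indented text lines for pid and all of its descendants."""
    out = ["    " * depth + str(pid)]
    for child in children.get(pid, []):
        out.extend(render(children, child, depth + 1))
    return out


# Invented sample input (hypothetical PIDs, not from the real log).
sample = [
    "100 clone(child_stack=NULL, flags=CLONE_CHILD_SETTID|SIGCHLD) = 101",
    '101 execve("/bin/true", ["true"], [/* 20 vars */]) = 0',
    "100 clone(child_stack=NULL, flags=SIGCHLD <unfinished ...>",
    "100 <... clone resumed> ) = 102",
    "102 clone(child_stack=NULL, flags=SIGCHLD) = 103",
    "102 clone(child_stack=NULL, flags=SIGCHLD) = -1 EAGAIN (Resource temporarily unavailable)",
]

tree = build_tree(sample)
print("\n".join(render(tree, 100)))
```

Failed calls (those returning -1) and still-unfinished lines are skipped,
so only children that really came into existence end up in the tree.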

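For some reason:

25027 receives a SIGALRM and, as a result, sends SIGTERM to its child
(4266). Before it dies, that child sends SIGTERM to its own children
(12868 and 20650) and unlinks the PID file.

In the tree above, 25027 then forks another child (12365), which gets
two children of its own (15895 and 32317), and kills it in the same way
after a second SIGALRM.

25027 then forks two more children (7431 and 2578). I am guessing this
is some sort of failsafe mode? They survive until 25027 itself receives
a SIGTERM and passes it on to them.

Finding out why 25027 is killing its children is probably the key
issue here.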
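
To chase that question through the full log, it may help to pull out only
the signal traffic. The sketch below is an assumption about how one might
do it, not part of amavis or strace: it picks out kill() calls and signal
delivery lines ("--- SIGxxx ... ---") and pairs each kill() with the last
signal the sender itself received. The sample lines are copied from the
tree above; the function names are invented.

```python
import re

# "25027 kill(4266, SIGTERM)   = 0" or the "<unfinished ...>" variant.
KILL_RE = re.compile(
    r"^\s*(?P<pid>\d+)\s+kill\((?P<target>-?\d+),\s*(?P<sig>SIG[A-Z0-9]+)"
)
# "4266  --- SIGTERM (Terminated) @ 0 (0) ---"
DELIVERY_RE = re.compile(r"^\s*(?P<pid>\d+)\s+--- (?P<sig>SIG[A-Z0-9]+) ")


def signal_events(lines):
    """Return ("sent"|"received", pid, signal, target_or_None) tuples."""
    events = []
    for line in lines:
        m = KILL_RE.match(line)
        if m:
            events.append(("sent", int(m["pid"]), m["sig"], int(m["target"])))
            continue
        m = DELIVERY_RE.match(line)
        if m:
            events.append(("received", int(m["pid"]), m["sig"], None))
    return events


def last_signal_before_kill(events):
    """Pair each kill() with the signal its sender most recently received."""
    last_received = {}
    pairs = []
    for kind, pid, sig, target in events:
        if kind == "received":
            last_received[pid] = sig
        else:
            pairs.append((pid, target, last_received.get(pid)))
    return pairs


# Lines taken from the process tree earlier in this mail.
sample_log = [
    "25027 --- SIGALRM (Alarm clock) @ 0 (0) ---",
    "25027 kill(4266, SIGTERM)               = 0",
    "25027 wait4(4266,  <unfinished ...>",
    "4266  --- SIGTERM (Terminated) @ 0 (0) ---",
    "4266 kill(12868, SIGTERM)              = 0",
    "4266 kill(20650, SIGTERM <unfinished ...>",
]

for sender, target, trigger in last_signal_before_kill(signal_events(sample_log)):
    print(f"{sender} -> {target} (last signal received by sender: {trigger})")
```

Being adjacent in the log is not proof of cause, so this only narrows down
where to look. If the full log contains alarm() or setitimer() calls, the
same kind of filter could be pointed at those to see where 25027 arms the
timer that later fires as SIGALRM.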
-- 
Brian May <[EMAIL PROTECTED]>

