On Thursday, November 05, 2020 at 02:05:38 PM -0800, Kevin J. McCarthy wrote:
> On Thu, Nov 05, 2020 at 10:43:17PM +0100, Matthias Apitz wrote:
> >And as I said, all is working fine, i.e. the mails get sent fine, the
> >only problem is this message spilt out by mutt about mail not sent.
> >
> >I will nail this down, it will only take some time, and I feel that it
> >has todo with the handling on SIGCHLD which is set by the application
> >server which calls the master script perhaps to wrong state. Needs some
> >more debugging. Mutt itself isn't the culprit perhaps.
>
> Okay, it definitely sounds like something beyond my ability to assist
> deeply. :-)
>
> From Mutt's point of view, it is looking for the exit code after
> waitpid() finishes. If there is an error from waitpid() or the
> WEXITSTATUS is not 0, then Mutt will print that message out.

Hello Kevin et al.,

I think I nailed it down. See the process chain below. In send_msg() Mutt
waits for the forked 'sendmail' process to end, but when that happens the
PID no longer exists and wait(2) returns -1 with errno set to ECHILD,
which in this case should not be treated as an error.

That errno == ECHILD is not necessarily an error on Linux has been
discussed many times already, see for example:

https://stackoverflow.com/questions/55150189/linux-system-returns-1-errno-10-no-child-processes

I think the code in sendlib.c line 2286 ... should be changed to something
like this:

diff -c sendlib.c.orig sendlib.c
*** sendlib.c.orig Tue May 30 21:27:53 2017
--- sendlib.c Fri Nov 6 08:16:24 2020
***************
*** 2283,2289 ****
        }
        else
        {
!         st = (SendmailWait > 0 && errno == EINTR && SigAlrm) ? S_BKG : S_ERR;
          if (SendmailWait > 0 && tempfile && *tempfile)
          {
--- 2283,2289 ----
        }
        else
        {
!         st = ((SendmailWait > 0 && errno == EINTR && SigAlrm) || (errno == ECHILD)) ? S_BKG : S_ERR;
          if (SendmailWait > 0 && tempfile && *tempfile)
          {
***************
*** 2310,2315 ****
--- 2310,2317 ----
        if (pid != -1 && waitpid (pid, &st, 0) > 0)
          st = WIFEXITED (st) ? WEXITSTATUS (st) : S_ERR; /* return child status */
+       else if (errno == ECHILD)
+         st = S_BKG;
        else
          st = S_ERR; /* error */

For me this solved the problem.

Here is the process chain (please view it in a fixed-width font):

10499 execve("/usr/local/sisis-pap/bin/mutt", ["/usr/local/sisis-pap/bin/mutt", "-d4", ...
|
|
+ 10499 clone(child_stack=NULL .... ) = 10502
  |
  |
  + 10502 clone(child_stack=NULL, flags ...) = 10503
  |  |
  |  + 10503 execve("sisis2mail.sh", ["sisis2mail.sh", "--cat", "--", "foo@zone"] ...
  |     |
  |     |
  |     + 10504 execve("/bin/cat", ["cat", "--"], ...
  |
  |
  10502
  10502 wait4(10503, <unfinished ...>

and then at the end:

10503 exit_group(0) = ?
10503 +++ exited with 0 +++
10502 <... wait4 resumed> 0x7ffc8dc9137c, 0, NULL) = -1 ECHILD (No child processes)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10502 alarm(0) = 0
10502 rt_sigaction(SIGALRM, {sa_handler=0xd76dd3, sa_mask=[], sa_flags=SA_RESTORER|0x200, sa_restorer=0x7f9762c035a0}, NULL, 8) = 0
10502 kill(10499, SIG_0) = 0
10502 exit_group(127) = ?
      ^^^^^^^^^^^^^^^^^^^^^
10502 +++ exited with 127 +++
10499 <... wait4 resumed> 0x7ffc8dc9137c, 0, NULL) = -1 ECHILD (No child processes)
10499 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
10499 rt_sigaction(SIGQUIT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f9762c035a0}, NULL, 8) = 0
10499 rt_sigaction(SIGINT, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f9762c035a0}, NULL, 8) = 0
10499 write(2, "Fehler 127 beim Versand der Nachricht (Exec error.).", 52) = 52
10499 write(2, "\n", 1) = 1
10499 unlink("/tmp/mutt-srap38dxr1-900118-10499-18399221751636680389") = 0
10499 write(3, "[2020-11-05 09:51:33] ", 22) = 22
10499 write(3, "mutt_free_body: unlinking /tmp/mutt-srap38dxr1-900118-10499-18399221751636680389.\n", 82) = 82
10499 write(1, "Debugging auf Ebene 4.\nKonnte Nachricht nicht verschicken.\n", 59) = 59
10499 exit_group(1) = ?
10499 +++ exited with 1 +++

See also what the wait(2) man page says about errno ECHILD:

   ECHILD (for waitpid() or waitid()) The process specified by pid
          (waitpid()) or idtype and id (waitid()) does not exist or is
          not a child of the calling process. (This can happen for
          one's own child if the action for SIGCHLD is set to SIG_IGN.
          See also the Linux Notes section about threads.)

--
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
Без книги нет знания, без знания нет коммунизма (Влaдимир Ильич Ленин)
Without books no knowledge - without knowledge no communism (Vladimir Ilyich Lenin)
Sin libros no hay saber - sin saber no hay comunismo. (Vladimir Ilich Lenin)