Mats Luspa: > Hello! > > Thank you for the exhausting explanation of the problem. > > Here you got the requested information about the system: > root@outgoingmail-2:~# uname -a > Linux outgoingmail-2 3.16.0-29-generic #39-Ubuntu SMP Mon Dec 15 > 22:27:29 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
What is the name of the Ubuntu distribution? Can you test mail to a non-existent address using a non-container MTA? Containers involve a complex blend of resource management tweaks in order to achieve isolation. I just tested Postfix 2.11.4 on a 3.17 kernel (Fedora Core 20) and the bounce service works as expected. Feb 14 13:45:44 fc20 postfix/smtp[8288]: 29891200686: to=<sjklhs...@porcupine.org>, relay=spike.porcupine.org[168.100.189.2]:25, delay=116, delays=110/0/5.9/0.04, dsn=5.1.1, status=bounced (host spike.porcupine.org[168.100.189.2] said: 550 5.1.1 <sjklhs...@porcupine.org>: Recipient address rejected: User unknown (in reply to RCPT TO command)) Feb 14 13:45:44 fc20 postfix/cleanup[8286]: 27A8420068D: message-id=<20150214184544.27a84200...@fc20.porcupine.org> Feb 14 13:45:44 fc20 postfix/bounce[8289]: 29891200686: sender non-delivery notification: 27A8420068D Feb 14 13:45:44 fc20 postfix/qmgr[8285]: 27A8420068D: from=<>, size=2466, nrcpt=1 (queue active) Feb 14 13:45:44 fc20 postfix/qmgr[8285]: 29891200686: removed Feb 14 13:45:44 fc20 postfix/local[8290]: 27A8420068D: to=<wie...@fc20.porcupine.org>, relay=local, delay=0.07, delays=0.02/0.01/0/0.04, dsn=2.0.0, status=sent (delivered to mailbox) Feb 14 13:45:44 fc20 postfix/qmgr[8285]: 27A8420068D: removed Wietse > root@outgoingmail-2:~# postconf mail_version > mail_version = 2.11.1 > > I've installed postfix from the Ubuntu utopic package. > > More insformation about the system is that it is a linux container. > But I don't think that has any effect to the software. > > It's no problem for me to let you get access to the system to debug if > you would prefer that. If you are familiar with linux-containers I > could also make a clone and you could get it. > I could also get the source code from the package, compile and add the > debugging code that you mentioned in the mail. > > Mail me what method that you want to use. > > /Regards Mats > > Quoting Wietse Venema <wie...@porcupine.org>: > > > Thanks for the "smtp -v/relay -v" logging. Your logging confirms > > that there is a bogus error talking to your bounce daemon. > > > > Although Postfix detects the bogus error, unfortunately it produces > > no informative logging for this particular error. > > > > Questions: > > > > - What is the output from "uname -a"? Postfix has bug workarounds > > for MacOSX and Solaris, but maybe they improved their software and > > we need to add a new workaround. > > > > - What is the output from "postconf mail_version"? maybe you are > > running a Postfix version that does not have all the known OS bug > > workarounds. > > > > - Did you build Postfix from source code, or is this a binary > > distribution? If you built from source I will ask you to modify the > > source for better logging. If you installed a binary distribution, > > I can set up a VM with the same OS and Postfix distribution and do > > the experiments myself. > > > > The remainder of this email is annotated logging that shows problems > > talking to the bounce daemon, in both the Postfix SMTP client and > > in the queue manager. The logging also illustrates that is a bogus > > error because the bounce service requests actually succeed. > > > > First, the Postfix SMTP client receives a 550 reply to an RCPT TO command, > > and attempts to report the delivery error to the bounce service (this > > appends the error report to a file /var/spool/postfix/48E68BF5F). > > > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: < > > mail.telia.com[62.20.233.128]:25: 550 RCPT > > TO:<nonexistingm...@telia.com> User unknown > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > connect to subsystem private/bounce > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr nrequest = 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr flags = 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr queue_id = 48E68BF5F > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr original_recipient = nonexistingm...@telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr recipient = nonexistingm...@telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr offset = 751 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr dsn_orig_rcpt = rfc822;nonexistingm...@telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr notify_flags = 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr status = 5.0.0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr diag_type = smtp > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr diag_text = 550 RCPT TO:<nonexistingm...@telia.com> User unknown > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr mta_type = dns > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr mta_mname = mail.telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr action = failed > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr reason = host mail.telia.com[62.20.233.128] said: 550 RCPT > > TO:<nonexistingm...@telia.com> User unknown (in reply to RCPT TO > > command) > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > connect to subsystem private/defer > > > > After "send attr reason" the bounce daemon is supposed to reply > > with a status (success or failure). But the logging shows that the > > Postfix SMTP client does not attempt to receive that response. > > Instead it has decided that the bounce request could not be sent, > > and makes a connection to the defer service to report that error. > > > > As we will see later the bounce request did actually succeed. > > > > Here is the function that sends the bounce request and receives the > > status reply. > > > > int mail_command_client(const char *class, const char *name,...) > > { > > va_list ap; > > VSTREAM *stream; > > int status; > > > > /* > > * Talk a little protocol with the specified service. > > */ > > if ((stream = mail_connect(class, name, BLOCKING)) == 0) > > return (-1); > > va_start(ap, name); > > status = attr_vprint(stream, ATTR_FLAG_NONE, ap); > > va_end(ap); > > if (status != 0 > > || attr_scan(stream, ATTR_FLAG_STRICT, > > RECV_ATTR_INT(MAIL_ATTR_STATUS, &status), 0) != 1) > > status = -1; > > (void) vstream_fclose(stream); > > return (status); > > } > > > > Apparently, attr_vprint (an alias for attr_vprint0) returns that > > the request could not be sent, OR attr_scan (an alias for attr_scan0) > > fails to receive the reply. > > > > Here is the attr_vprint0 code fragment: > > > > int attr_vprint0(VSTREAM *fp, int flags, va_list ap) > > { > > ... > > /* > > * Iterate over all (type, name, value) triples, and produce output on > > * the fly. > > */ > > while ((attr_type = va_arg(ap, int)) != ATTR_TYPE_END) { > > switch (attr_type) { > > ... code that logs "send attr xxx = yyy" and that uses > > vstream_fprintf() or vstream_fwrite() depending on the > > attribute type... > > } > > if ((flags & ATTR_FLAG_MORE) == 0) > > VSTREAM_PUTC('\0', fp); > > return (vstream_ferror(fp)); > > } > > > > The vstream_ferror() at the end implements the error check on exit > > from the attr_vprint0 function. > > > > Now it is extremely unlikely that vstream_ferror() reports an error > > on a newly-opened stream that has only a few bytes queued for output > > and that has never been flushed. > > > > So let's have a look at attr_scan0() and see if this function > > can return without trying to read any input. > > > > int attr_vscan0(VSTREAM *fp, int flags, va_list ap) > > { > > ...declarations... > > > > /* > > * EOF check. > > */ > > if ((ch = VSTREAM_GETC(fp)) == VSTREAM_EOF) > > return (0); > > vstream_ungetc(fp, ch); > > > > ...after this comes code that logs "wanted attribute: <name of > > attribute>" and that attempts to receive the attribute, and > > reports "premature EOF" or other read errors... > > > > So the more likely explanation is that VSTREAM_GETC() reports an > > error. That error happens either as the queued output with bounce > > request is auto-flushed, or it waits for the socket to become > > readable. > > > > This activity is logged with "-v -v -v", but that may actually slow > > down the program enough to affect kernel bug race-condition errors > > and change the result, just like program tracing does. > > > > At this point I am pretty sure that we are looking at a kernel bug > > with UNIX-domain sockets, but I don't want to debug kernel bugs > > over email. I can debug a system that runs in a local VM. > > > > Debugging should be straightforward: mail_command_client() is used > > only for bounce daemon requests (including defer and trace requests) > > so this function can be ripped apart, and the fault will be with > > one of the following steps: > > > > write-wait for the socket to become writable > > > > write the queued bounce server request > > > > read-wait for the socket to become readable > > > > The remainder of the log shows that the bounce request succeeded anyway. > > > > First, the defer service request to report the bounce service error: > > > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > connect to subsystem private/defer > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr nrequest = 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr flags = 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr queue_id = 48E68BF5F > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr original_recipient = nonexistingm...@telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr recipient = nonexistingm...@telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr offset = 751 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr dsn_orig_rcpt = rfc822;nonexistingm...@telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr notify_flags = 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr status = 4.3.0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr diag_type = smtp > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr diag_text = 550 RCPT TO:<nonexistingm...@telia.com> User unknown > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr mta_type = dns > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr mta_mname = mail.telia.com > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr action = delayed > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr reason = bounce or trace service failure > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > 48E68BF5F: to=<nonexistingm...@telia.com>, > > relay=mail.telia.com[62.20.233.128]:25, delay=5.3, > > delays=0.16/0/0.06/5.1, dsn=4.3.0, status=deferred (bounce or trace > > service failure) > > > > Then some code to update the fast flush log if applicable (it is not): > > > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > flush_add: site telia.com id 48E68BF5F > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > match_hostname: telia.com ~? irf.se > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > match_list_match: telia.com: no match > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > flush_add: site telia.com id 48E68BF5F status 4 > > > > The last part of the outbound SMTP session: > > > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > > mail.telia.com[62.20.233.128]:25: RSET > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > smtp_stream_setup: maxtime=20 enable_deadline=0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: < > > mail.telia.com[62.20.233.128]:25: 250 RSET > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > > mail.telia.com[62.20.233.128]:25: QUIT > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > name_mask: resource > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > name_mask: software > > > > The report from SMTP client to queue manager that delivery was deferred: > > > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > deliver_request_final: send: "" -1 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr status = > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr diag_type = > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr diag_text = > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr mta_type = > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr mta_mname = > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr action = > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr reason = > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr status = 4294967295 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > master_notify: status 1 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > connection closed > > > > And that is the end of the SMTP client's attempt to deliver the > > message with queue ID 48E68BF5F. > > > > And now for the surprise: the bounce daemon did record the request > > to bounce the message with queue ID 48E68BF5F. The queue manager > > sees that this file exists and asks the bounce daemon to create a > > non-delivery notification for the message with queue ID 87282BF62. > > > > And although the queue manager reports an error talking to the > > bounce service, > > the bounce daemon creates a delivery status notification with queue > > ID 87282BF62. > > > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/cleanup[12867]: > > 87282BF62: message-id=<20150214081828.87282b...@outgoingmail-2.irf.se> > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/bounce[12876]: > > 48E68BF5F: sender non-delivery notification: 87282BF62 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/qmgr[12816]: > > 87282BF62: from=<>, size=2552, nrcpt=1 (queue active) > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/qmgr[12816]: > > 48E68BF5F: status=deferred (bounce failed) > > > > The next fragment is the SMTP client delivering the delivery status > > notification with queue ID 87282BF62. > > > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > connection established > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > master_notify: status 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: > > deliver_request_initial: send initial status > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: send > > attr status = 0 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: smtp > > socket: wanted attribute: flags > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input > > attribute name: flags > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input > > attribute value: 3 > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: smtp > > socket: wanted attribute: queue_name > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input > > attribute name: queue_name > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input > > attribute value: active > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: smtp > > socket: wanted attribute: queue_id > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input > > attribute name: queue_id > > 2015-02-14T09:18:28+02:00 outgoingmail-2 postfix/smtp[12869]: input > > attribute value: 87282BF62 > > > > This is then delivered. > > > > > > -- > Mats Luspa > Phone: +46 (0)980 79 022 > Institutet f?r rymdfysik Fax: +46 (0)980 79 050 > Swedish Institute of Space Physics email: ma...@irf.se > Visiting/Delivery address: Rymdcampus 1, SE-981 92 Kiruna > Postal address: Box 812, SE-981 28 Kiruna > -- > PGP Public Key: http://www.irf.se/~matsl/pgpmatsl.asc > >