Re: Delay in qmgr

Viktor Dukhovni Fri, 14 Jun 2019 11:12:45 -0700

On Fri, Jun 14, 2019 at 03:17:44PM +0300, anzelmooo . wrote:

> Jun 14 09:36:26 mx1 postfix/smtp[6409]: 488845290: to=<us...@onmicrosoft.com>,
>   relay=host.mail.protection.outlook.com[ip.ip.ip.ip]:25, delay=1.7,
>   delays=0.01/0/0.34/1.4, dsn=2.6.0, status=sent (250 2.6.0 <msgid2>
>   [InternalId=1234, Hostname=host.prod.outlook.com] 111697 bytes in 0.722,
>   151.051 KB/sec Queued mail for delivery)
> Jun 14 09:41:26 mx1 postfix/qmgr[2699]: 488845290: removed

Note the 5 minute (300s) delay.

On Fri, Jun 14, 2019 at 09:25:01AM -0400, Wietse Venema wrote:

> If you want to debug this, you can log the SMTP session (debug_peer_list
> = example.com) and see if the remote server delays its QUIT response.
> That could explain the 5s delay.

That was 5 minutes, and given the Postfix default:

    smtp_quit_timeout = 300s

the QUIT hypothesis is in the right ballpark.

> Such delays should not be a problem with the default setting
> "smtp_skip_quit_response = yes", but maybe RedHat has changed that.

But I doubt Redhat changed this.  Another plausible hypothesis is:

    smtp_starttls_timeout = 300s

which 2.6 also applies to the 2-way TLS session shutdown.  Perhaps
after "QUIT", the TCP session appears to stay up with no close
notify from the server before the timeout.

There may be relevant changes in the RedHat SSL stack.  It would
be good to see the TLS log messages (which don't have the queue-id,
but have the same process-id as the smtp delivery agent) and are
logged shortly before delivery success.  This would confirm use
of TLS and protocol details.

By the way, already in RF2246 (TLS 1.0), there is no requirement
to wait for the peer's close-notify.  If the application protocol
provides sufficient session termination signalling, then there's
no need to duplicate that at the TLS close-notify layer.

   https://tools.ietf.org/html/rfc2246#section-7.2.1
   https://tools.ietf.org/html/rfc4346#section-7.2.1
   https://tools.ietf.org/html/rfc5246#section-7.2.1

   Unless some other fatal alert has been transmitted, each party
   is required to send a close_notify alert before closing the write
   side of the connection.  The other party MUST respond with a
   close_notify alert of its own and close down the connection
   immediately, discarding any pending writes.  It is not required
   for the initiator of the close to wait for the responding
   close_notify alert before closing the read side of the connection.

Therefore, we should probably eliminate the second tls_bio_shutdown()
call from tls_session_stop().

-- 
        Viktor.

--- src/tls/tls_session.c
+++ src/tls/tls_session.c
@@ -95,18 +95,13 @@ void    tls_session_stop(TLS_APPL_STATE *unused_ctx, 
VSTREAM *stream, int timeou
        msg_panic("%s: stream has no active TLS context", myname);

     /*
-     * Perform SSL_shutdown() twice, as the first attempt will send out the
-     * shutdown alert but it will not wait for the peer's shutdown alert.
-     * Therefore, when we are the first party to send the alert, we must call
-     * SSL_shutdown() again. On failure we don't want to resume the session,
-     * so we will not perform SSL_shutdown() and the session will be removed
-     * as being bad.
+     * Perform SSL_shutdown(), sending out the shutdown alert without waiting
+     * for the peer's shutdown alert.  On failure we don't want to resume the
+     * session, so we will not perform SSL_shutdown() and the session will be
+     * removed as being bad.
      */
-    if (!failure) {
-       retval = tls_bio_shutdown(vstream_fileno(stream), timeout, TLScontext);
-       if (retval == 0)
-           tls_bio_shutdown(vstream_fileno(stream), timeout, TLScontext);
-    }
+    if (!failure)
+       (void) tls_bio_shutdown(vstream_fileno(stream), timeout, TLScontext);
     tls_free_context(TLScontext);
     tls_stream_stop(stream);
 }

Re: Delay in qmgr

Reply via email to