On Tue, Dec 16, 2008 at 12:05:40PM -0700, Jesús Manuel Loaiza Vidal wrote:

> Here are the log and pcap file with the patch:
>
> Mail log <http://www.ich.edu.mx/attachments/postfix-3.txt>
> PCAP file <http://www.ich.edu.mx/attachments/tcp-3.cap>

Thanks, this is very useful. We are much closer to finding the root
cause:

    Dec 16 11:48:13 [postfix/smtpd] TLS I/O status -1 error 1
    Dec 16 11:48:13 [postfix/smtpd] Read -1 chars
    Dec 16 11:48:13 [postfix/smtpd] smtp_get: EOF

> >+    if (TLScontext->log_level >= 4)
> >+        msg_info("TLS I/O status %d error %d", status, err);

What the new logs show is that SSL_read() fails with error
(openssl/ssl.h):

    #define SSL_ERROR_SSL 1

and, what's more, this happens without any attempt to read the network
socket. Normally, when SSL runs out of data it fails with
SSL_ERROR_WANT_READ (2), which triggers Postfix to perform physical I/O
to fill the network side of the biopair. The normal pattern looks like
this:

    Dec 16 11:48:13 [postfix/smtpd] read from 09F05160 [09F19710] (5 bytes => -1 (0xFFFFFFFF))
    Dec 16 11:48:13 [postfix/smtpd] TLS I/O status -1 error 2
    Dec 16 11:48:13 [postfix/smtpd] network_biopair_interop fd=14 want_read=5
    Dec 16 11:48:13 [postfix/smtpd] read from 09F05160 [09F19710] (5 bytes => 5 (0x5))
    Dec 16 11:48:13 [postfix/smtpd] 0000 17 03 01 00 30 ....0
    Dec 16 11:48:13 [postfix/smtpd] read from 09F05160 [09F19715] (48 bytes => -1 (0xFFFFFFFF))
    Dec 16 11:48:13 [postfix/smtpd] TLS I/O status -1 error 2
    Dec 16 11:48:13 [postfix/smtpd] network_biopair_interop fd=14 want_read=48
    Dec 16 11:48:13 [postfix/smtpd] read from 09F05160 [09F19715] (48 bytes => 48 (0x30))

The error 1 failure is not supposed to happen. The SSL library fails
without any attempt to read new data, so the internal library state is
somehow corrupted. We need to log the error detail from the SSL library
to see what the library is unhappy about.

This sure looks like an SSL library bug, a compiler bug or a hardware
issue. The kernel is no longer suspect for now.

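For anyone following along in the logs, the numbers after "error" are
SSL_get_error() result codes. Below is a minimal sketch (not Postfix
code) that pulls the status and error out of a "TLS I/O status" log line
and says what each code usually means for a read loop. The function
names and the TLSIO_ prefix are made up for illustration; only the
numeric values are taken from openssl/ssl.h, and they are copied here so
the sketch builds without OpenSSL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * SSL_get_error() result codes, values copied from openssl/ssl.h.
 * The TLSIO_ names are hypothetical, chosen to avoid clashing with
 * the real SSL_ERROR_* macros.
 */
enum {
    TLSIO_ERROR_NONE = 0,
    TLSIO_ERROR_SSL = 1,
    TLSIO_ERROR_WANT_READ = 2,
    TLSIO_ERROR_WANT_WRITE = 3,
    TLSIO_ERROR_SYSCALL = 5,
    TLSIO_ERROR_ZERO_RETURN = 6
};

/*
 * Hypothetical helper: find "TLS I/O status <n> error <m>" in a log
 * line. Returns 1 and fills *status and *err on a match, else 0.
 */
int tlsio_parse(const char *line, int *status, int *err)
{
    const char *p;

    assert(line != NULL && status != NULL && err != NULL);
    p = strstr(line, "TLS I/O status ");
    if (p == NULL)
        return 0;
    return sscanf(p, "TLS I/O status %d error %d", status, err) == 2;
}

/* Hypothetical helper: one-line description of an error code. */
const char *tlsio_describe(int err)
{
    switch (err) {
    case TLSIO_ERROR_NONE:
        return "SSL_ERROR_NONE: operation completed";
    case TLSIO_ERROR_SSL:
        return "SSL_ERROR_SSL: library failure, details are in the error queue";
    case TLSIO_ERROR_WANT_READ:
        return "SSL_ERROR_WANT_READ: need more data from the network, then retry";
    case TLSIO_ERROR_WANT_WRITE:
        return "SSL_ERROR_WANT_WRITE: need to flush data to the network, then retry";
    case TLSIO_ERROR_SYSCALL:
        return "SSL_ERROR_SYSCALL: I/O error outside the SSL library";
    case TLSIO_ERROR_ZERO_RETURN:
        return "SSL_ERROR_ZERO_RETURN: peer closed the TLS connection";
    default:
        return "other SSL_get_error() code";
    }
}
```

Feeding it the failing line from the log yields status -1 and error 1,
which tlsio_describe() reports as SSL_ERROR_SSL, while the "error 2"
lines map to SSL_ERROR_WANT_READ.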
In addition to the previous patch, please also apply:

--- src/tls/tls_bio_ops.c 2008-12-16 15:12:12.000000000 -0500
+++ src/tls/tls_bio_ops.c 2008-12-16 15:12:30.000000000 -0500
@@ -345,6 +345,10 @@
                 return (-1);            /* network read/write error */
             }
             break;
+        case SSL_ERROR_SSL:
+            if (hsfunc == 0)
+                tls_print_errors();
+            /* FALLTHROUGH */
         default:
             retval = status;
             done = 1;

and report the results (logs).

--
Viktor.