maximal_backoff_time not properly applied on a multi instance server

2018-12-12 Thread Brice Lopez
Hello,

I'm dealing with a strange issue on a postfix server running multiple 
instances. I've set a maximal_backoff_time of 8 hours in the instance's 
main.cf. I can check it was actually applied with postconf:

  $ sudo postmulti -i postfix-p26-o151 -x postconf | grep -E 'queue_lifetime|backoff_time'
  bounce_queue_lifetime = 5d
  maximal_backoff_time = 8h
  maximal_queue_lifetime = 3d
  minimal_backoff_time = 10m

However, when I check the logs for this instance, I can find multiple 
occurrences of deferred emails moved back to the active queue with the default 
maximal_backoff_time of 4000s. I've double-checked the Postfix documentation: 
the main.cf file should be private to each instance. Is this a bug, 
or did I forget a parameter?

2018-12-12T01:06:34.384322+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781, nrcpt=1 (queue active)
2018-12-12T01:07:40.017722+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/smtp[6667]: C0E76483ECE: to=, 
relay=mx..AAA[DDD.DD.DD.DD]:25, delay=386893, delays=386828/65/0.37/0.1, 
dsn=4.7.1, status=deferred (host mx..AAA[DDD.DD.DD.DD] said: 450 4.7.1 Your 
email server DDD.DDD.DDD.DDD was temporarily blocked due to spam - contact us 
and report a problem (in reply to RCPT TO command))
2018-12-12T02:16:35.950068+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781, nrcpt=1 (queue active)
2018-12-12T02:17:37.859783+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/smtp[1443]: C0E76483ECE: to=, 
relay=mx..AAA[DDD.DD.DD.DD]:25, delay=391091, delays=391029/61/0.37/0.1, 
dsn=4.7.1, status=deferred (host mx..AAA[DDD.DD.DD.DD] said: 450 4.7.1 Your 
email server DDD.DDD.DDD.DDD was temporarily blocked due to spam - contact us 
and report a problem (in reply to RCPT TO command))
2018-12-12T03:26:35.447677+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781, nrcpt=1 (queue active)
2018-12-12T03:27:34.835921+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/smtp[31192]: C0E76483ECE: to=, 
relay=mx..AAA[DDD.DD.DD.DD]:25, delay=395288, delays=395229/59/0.37/0.11, 
dsn=4.7.1, status=deferred (host mx..AAA[DDD.DD.DD.DD] said: 450 4.7.1 Your 
email server DDD.DDD.DDD.DDD was temporarily blocked due to spam - contact us 
and report a problem (in reply to RCPT TO command))
2018-12-12T04:36:30.932122+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781, nrcpt=1 (queue active)
2018-12-12T04:37:41.922446+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/smtp[28194]: C0E76483ECE: to=, 
relay=mx..AAA[DDD.DD.DD.DD]:25, delay=399495, delays=399424/71/0.37/0.11, 
dsn=4.7.1, status=deferred (host mx..AAA[DDD.DD.DD.DD] said: 450 4.7.1 Your 
email server DDD.DDD.DDD.DDD was temporarily blocked due to spam - contact us 
and report a problem (in reply to RCPT TO command))
2018-12-12T05:46:31.881443+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781, nrcpt=1 (queue active)
2018-12-12T05:47:46.309076+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/smtp[25988]: C0E76483ECE: to=, 
relay=mx..AAA[DDD.DD.DD.DD]:25, delay=403700, delays=403625/74/0.37/0.11, 
dsn=4.7.1, status=deferred (host mx..AAA[DDD.DD.DD.DD] said: 450 4.7.1 Your 
email server DDD.DDD.DDD.DDD was temporarily blocked due to spam - contact us 
and report a problem (in reply to RCPT TO command))
2018-12-12T06:56:33.072730+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781, nrcpt=1 (queue active)
2018-12-12T06:57:54.902251+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/smtp[26065]: C0E76483ECE: to=, 
relay=mx..AAA[DDD.DD.DD.DD]:25, delay=407908, delays=407826/81/0.37/0.11, 
dsn=4.7.1, status=deferred (host mx..AAA[DDD.DD.DD.DD] said: 450 4.7.1 Your 
email server DDD.DDD.DDD.DDD was temporarily blocked due to spam - contact us 
and report a problem (in reply to RCPT TO command))
2018-12-12T08:06:34.682500+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781, nrcpt=1 (queue active)
2018-12-12T08:08:00.726228+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/smtp[28467]: C0E76483ECE: to=, 
relay=mx..AAA[DDD.DD.DD.DD]:25, delay=412114, delays=412028/86/0.37/0.11, 
dsn=4.7.1, status=deferred (host mx..AAA[DDD.DD.DD.DD] said: 450 4.7.1 Your 
email server DDD.DDD.DDD.DDD was temporarily blocked due to spam - contact us 
and report a problem (in reply to RCPT TO command))
2018-12-12T09:16:37.731205+00:00 pr-li-smtp00-0047 info 
postfix-p26-o151/qmgr[31295]: C0E76483ECE: 
from=,
 size=28781,

Regards,

Brice Lopez
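
The retry interval can be read straight off the delay= values in the log
excerpt above, since each one is the age of the message at that delivery
attempt. A minimal sketch, assuming the seven delay= values are copied by
hand from the smtp lines for queue ID C0E76483ECE; retry_gaps is a
hypothetical helper, not a Postfix tool:

```shell
#!/bin/sh
# retry_gaps: hypothetical helper, not part of Postfix.
# Given the delay= values logged for one queue ID, in log order,
# print the number of seconds between consecutive delivery attempts.
retry_gaps() {
    prev=""
    for d in "$@"; do
        if [ -n "$prev" ]; then
            echo $((d - prev))
        fi
        prev=$d
    done
}

# delay= values copied from the smtp lines for C0E76483ECE above
retry_gaps 386893 391091 395288 399495 403700 407908 412114
```

The gaps come out at roughly 4200s (70 minutes), far below the configured
8h (28800s). That would fit the 4000s default if the deferred queue is
scanned every 300s, but the thread does not show queue_run_delay, so the
300s figure is an assumption.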


Re: ignore SASL/Auth to specific server (internal exchange relay)

2018-12-12 Thread Viktor Dukhovni
> On Dec 12, 2018, at 1:36 AM, Stefan Bauer  wrote:
> 
> i already have a transport_maps in main.cf in place:
> transport_maps=hash:/etc/postfix/transport
> 
> domain1.de    exchange:
> 
> How can i set another  transport_maps setting in main.cf as you recommend?

I never recommended "another transport_maps" definition, I recommended
a table *entry* that sends mail to the non-SASL relay via a different
transport than mail to the relays that require SASL.  If you already
have that, then all you need to do is disable per-sender SASL auth for
that transport.

-- 
Viktor.



Re: maximal_backoff_time not properly applied on a multi instance server

2018-12-12 Thread Viktor Dukhovni



> On Dec 12, 2018, at 7:24 AM, Brice Lopez  wrote:
> 
> I'm dealing with a strange issue on a postfix server running multiple 
> instances. I've set a maximal_backoff_time of 8 hours in the instance's 
> main.cf. I can check it was actually applied with postconf:
> 
>  $ sudo postmulti -i postfix-p26-o151 -x postconf | grep -E 
> 'queue_lifetime|backoff_time'
>  bounce_queue_lifetime = 5d
>  maximal_backoff_time = 8h
>  maximal_queue_lifetime = 3d
>  minimal_backoff_time = 10m

$ date
$ ls -l /etc/postfix-p26-o151/main.cf

As root:

# qdir=$(postmulti -i postfix-p26-o151 -x postconf -hx queue_directory)
# ps -o pid,etime,args -p $(pgrep -P $(cat $qdir/pid/master.pid) -x qmgr)

-- 
Viktor.



Re: maximal_backoff_time not properly applied on a multi instance server

2018-12-12 Thread Brice Lopez
On Wed, Dec 12, 2018 at 08:45:14AM -0500, Viktor Dukhovni wrote:

> $ date
> $ ls -l /etc/postfix-p26-o151/main.cf
> 
> As root:
> 
> # qdir=$(postmulti -i postfix-p26-o151 -x postconf -hx queue_directory)
> # ps -o pid,etime,args -p $(pgrep -P $(cat $qdir/pid/master.pid) -x qmgr)

$ grep multi_instance_directories /etc/postfix/main.cf
multi_instance_directories = /etc/postfix/pools/p25/o8 
/etc/postfix/pools/p25/o19 /etc/postfix/pools/p25/o30 
/etc/postfix/pools/p25/o41 /etc/postfix/pools/p25/o52 
/etc/postfix/pools/p25/o63 /etc/postfix/pools/p25/o74 
/etc/postfix/pools/p25/o85 /etc/postfix/pools/p25/o96 
/etc/postfix/pools/p25/o107 /etc/postfix/pools/p25/o118 
/etc/postfix/pools/p25/o129 /etc/postfix/pools/p25/o140 
/etc/postfix/pools/p25/o151 /etc/postfix/pools/p25/o162 
/etc/postfix/pools/p25/o173 /etc/postfix/pools/p26/o8 
/etc/postfix/pools/p26/o19 /etc/postfix/pools/p26/o30 
/etc/postfix/pools/p26/o41 /etc/postfix/pools/p26/o52 
/etc/postfix/pools/p26/o63 /etc/postfix/pools/p26/o74 
/etc/postfix/pools/p26/o85 /etc/postfix/pools/p26/o96 
/etc/postfix/pools/p26/o107 /etc/postfix/pools/p26/o118 
/etc/postfix/pools/p26/o129 /etc/postfix/pools/p26/o140 
/etc/postfix/pools/p26/o151 /etc/postfix/pools/p26/o162 
/etc/postfix/pools/p26/o173

blopez@pr-li-smtp00-0047:/var/log/custom/p26/backup$ grep -r postfix-p26-o151 
/etc/postfix/
/etc/postfix/pools/p26/o151/main.cf:multi_instance_name = postfix-p26-o151

$ date
Wed Dec 12 14:05:22 UTC 2018

$ ls -l /etc/postfix/pools/p26/o151/main.cf
-rw-r--r-- 1 root root 2842 Dec 10 10:58 /etc/postfix/pools/p26/o151/main.cf

$ qdir=$(sudo postmulti -i postfix-p26-o151 -x postconf -hx queue_directory)
$(pgrep -P $(sudo cat $qdir/pid/master.pid) -x qmgr)
  PID ELAPSED COMMAND
31295 120-17:05:56 qmgr -l -t fifo -u

The instance was reloaded a few times since it's running, and on the 10th of 
September after the configuration modification. I just restarted it just in case.

Regards,

> -- 
>   Viktor.
> 

-- 
Brice Lopez


Re: maximal_backoff_time not properly applied on a multi instance server

2018-12-12 Thread Viktor Dukhovni
> On Dec 12, 2018, at 9:15 AM, Brice Lopez  wrote:
> 
> $ date
> Wed Dec 12 14:05:22 UTC 2018
> 
> $ ls -l /etc/postfix/pools/p26/o151/main.cf
> -rw-r--r-- 1 root root 2842 Dec 10 10:58 /etc/postfix/pools/p26/o151/main.cf

Your main.cf file last changed 2 days ago.

> $ qdir=$(sudo postmulti -i postfix-p26-o151 -x postconf -hx queue_directory)
> $(pgrep -P $(sudo cat $qdir/pid/master.pid) -x qmgr)
>  PID ELAPSED COMMAND
> 31295 120-17:05:56 qmgr -l -t fifo -u

Your queue manager has been running for 120 days, without a "postfix reload".

> The instance was reloaded a few times since it's running,

But not more recently than 120 days ago.

> and on the 10th of September after the configuration modification.

Apparently not, as "postfix reload" does restart the queue manager,
so your attempts to "reload" have not been effective.

> I just restarted it just in case.

That should cause the new settings to take effect.
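
The comparison made above (process age against file age) can be scripted. A
minimal sketch, assuming the ps etime format [[dd-]hh:]mm:ss and a
configuration age already expressed in seconds; etime_to_seconds and
qmgr_predates_config are hypothetical helper names, not Postfix commands:

```shell
#!/bin/sh
# etime_to_seconds: hypothetical helper, not part of Postfix.
# Convert a ps "etime" value ([[dd-]hh:]mm:ss) to seconds.
etime_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        n = NF
        s = $n + 60 * $(n - 1)
        if (n >= 3) s += 3600 * $(n - 2)
        if (n >= 4) s += 86400 * $(n - 3)
        printf "%d\n", s
    }'
}

# qmgr_predates_config: succeeds if the queue manager has been running
# longer than the time since main.cf was last modified.
# $1 = qmgr etime from ps, $2 = age of main.cf in seconds
qmgr_predates_config() {
    [ "$(etime_to_seconds "$1")" -gt "$2" ]
}

# Values from the thread: qmgr etime 120-17:05:56; main.cf modified
# "Dec 10 10:58" and checked at "Dec 12 14:05:22" (about 184042 seconds).
if qmgr_predates_config "120-17:05:56" 184042; then
    echo "qmgr is older than main.cf: the reload did not reach it"
fi
```

On a live system the first input could come from "ps -o etime= -p PID". The
file age depends on stat flags that differ between GNU and BSD, so that part
is left out here.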

-- 
Viktor.

P.S. Please don't send unsolicited email to outlook.com or any
 other email mailboxes.

Re: ignore SASL/Auth to specific server (internal exchange relay)

2018-12-12 Thread Daniel Miller

Not wanting to get in the way of the experts but this may help:

An oversimplified view of the transport map is it tells Postfix what 
line in master.cf to use for a given recipient domain (or full 
address).  There's only one transport map but it can have several lines 
for individual decisions.


Wietse's email told you to perform a command-line test to verify your 
transport map is set up correctly.  So do that first.


The definitions in master.cf tell Postfix where to listen and where to 
send the message.  So with an explicit transport mapping, using 
master.cf you provide explicit overrides to the defaults or global 
settings from main.cf.  So if the only "special" behavior you need for 
the exchange transport is no sasl:


exchange  unix -       -       n       -       -       smtp
 -o smtp_sender_dependent_authentication=no

Daniel
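
The idea that a transport entry picks a master.cf service by name can be
tried on a scratch copy of the table. A rough sketch, assuming a plain-text
transport file containing the domain1.de entry quoted in this thread;
transport_lookup is a hypothetical helper that only matches exact keys,
whereas Postfix also tries other key forms. On a real host,
"postmap -q domain1.de hash:/etc/postfix/transport" is the authoritative
check.

```shell
#!/bin/sh
# transport_lookup: hypothetical helper, not part of Postfix.
# Exact-match lookup in a text transport table.
# $1 = key (recipient domain), $2 = path to the table source file
transport_lookup() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Scratch table with the entry from this thread (not the live file)
tmp=$(mktemp)
printf 'domain1.de    exchange:\n' > "$tmp"
transport_lookup domain1.de "$tmp"    # prints "exchange:"
rm -f "$tmp"
```

The part before the colon in the result is the service name that has to
appear in the first column of master.cf.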

On 12/11/2018 1:40 PM, Stefan Bauer wrote:

thank you for your help!

If i understood you correctly, i set in transport:

domain1.de                exchange:

In master.cf 

exchange  unix -       -       n       -       -  smtp
 -o smtp_sender_dependent_authentication=no
 -o transport_maps=hash:/etc/postfix/transport_internal

And in transport_internal

domain1.de            smtp:192.168.124.5:2525 



but this way, Postfix does an MX lookup for domain1.de and does not seem
to honor transport_internal.


Is this basically the right path?


On Tue, Dec 11, 2018 at 21:48, Viktor Dukhovni 
<postfix-us...@dukhovni.org> wrote:


> On Dec 11, 2018, at 3:41 PM, Stefan Bauer <cubew...@googlemail.com> wrote:
>
> Can you recommend appropriate manual(s)? I don't understand what
> you mean by a separate transport.

http://www.postfix.org/master.5.html
http://www.postfix.org/transport.5.html
http://www.postfix.org/ADDRESS_REWRITING_README.html
http://www.postfix.org/FILTER_README.html#advanced_filter
  ( Advanced content filter: sending unfiltered mail to the
content filter )

Also the Postfix book by Patrick Koetter and Ralf Hildebrandt.

-- 
        Viktor.




some users can't receive mails from domain.

2018-12-12 Thread Selcuk Yazar
Hi,
Some of our users can receive mail from a national academic journal site, but
a few can't. When I look at the logs, they say:
..
 timeout after END-OF-MESSAGE from rs248.mailgun.us[209.61.151.248]
..
Is the problem on our server or theirs?

Thanks in advance
-- 
Selçuk YAZAR


Re: ignore SASL/Auth to specific server (internal exchange relay)

2018-12-12 Thread Viktor Dukhovni



> On Dec 12, 2018, at 2:48 PM, Daniel Miller  wrote:
> 
> Not wanting to get in the way of the experts but this may help:

Indeed a nice succinct and accessible answer for non-experts.  Please
don't hesitate to post similarly helpful replies.

> An oversimplified view of the transport map is it tells Postfix what line in 
> master.cf to use for a given recipient domain (or full address).  There's 
> only one transport map but it can have several lines for individual decisions.
> 
> Wietse's email told you to perform a command-line test to verify your 
> transport map is setup correctly.  So do that first.
> 
> The definitions in master.cf tell Postfix where to listen and where to send 
> the message.  So with an explicit transport mapping, using master.cf you 
> provide explicit overrides to the defaults or global settings from main.cf.  
> So if the only "special" behavior you need for the exchange transport is no 
> sasl: 
> 
> exchange  unix -   -   n   -   -   smtp
>  -o smtp_sender_dependent_authentication=no

-- 
Viktor.



Re: some users can't receive mails from domain.

2018-12-12 Thread Viktor Dukhovni
> On Dec 12, 2018, at 2:49 PM, Selcuk Yazar  wrote:
> 
> Hi,
> Some of our users can receive mail from national academic journal site but a 
> few can't. when i look the logs, it says;
> ..  
>  timeout after END-OF-MESSAGE from rs248.mailgun.us[209.61.151.248]
> ..
> is it about our server or their server ?
> 
> thanks in advance

You'll need a PCAP capture:

  http://www.postfix.org/DEBUG_README.html#sniffer

which you can analyze with wireshark, tshark, or similar.

Often this is indicative of path MTU issues, but the PCAP file
should tell all.

-- 
Viktor.



Re: maximal_backoff_time not properly applied on a multi instance server

2018-12-12 Thread Brice Lopez
On Wed, Dec 12, 2018 at 12:47:07PM -0500, Viktor Dukhovni wrote:
> 
> Apparently not, as "postfix reload" does restart the queue manager,
> so your attempts to "reload" have not been effective.
> 
> > I just restarted it just in case.
> 
> That should cause the new settings to take effect.

Indeed, it worked. It was a bug in the Ansible playbook, which I confirmed by 
digging in the logs. Thank you for the quick troubleshooting.

Regards,

-- 
Brice Lopez


Re: some users can't receive mails from domain.

2018-12-12 Thread Wietse Venema
Selcuk Yazar:
> Hi,
> Some of our users can receive mail from national academic journal site but
> a few can't. when i look the logs, it says;
> ..
>  timeout after END-OF-MESSAGE from rs248.mailgun.us[209.61.151.248]
> ..
> is it about our server or their server ?

Probably neither, i.e. suspect a network-level problem. These
tend to be caused by middleboxes (firewalls, traffic shapers, etc.)
that break path MTU discovery, TCP window scaling, both, or something
else. As Viktor suggests, only a packet sniffer can determine the
cause. No packet content is needed, but TCP metadata is essential
(handshake, flags, ack, window size, etc.).

Wietse


Re: some users can't receive mails from domain.

2018-12-12 Thread Benny Pedersen

Selcuk Yazar wrote on 2018-12-12 20:49:


>  timeout after END-OF-MESSAGE from rs248.mailgun.us[209.61.151.248]
> ..
> is it about our server or their server ?


I have problems with them as well; it turned out to be problems with SSL/TLS.

I don't know how to solve it.

OpenSSL here has SSLv2 and SSLv3 disabled at compile time.


Re: some users can't receive mails from domain.

2018-12-12 Thread Viktor Dukhovni
> On Dec 12, 2018, at 3:15 PM, Benny Pedersen  wrote:
> 
> I have problems with them as well, turned out have problems with ssl/tls

If you want the problem solved, start by posting non-verbose logs that
show the problem behaviour.  Don't get distracted by SSL handshake
failures from some broken clients, if the client then retries in
cleartext.  Some systems fail to implement opportunistic TLS sensibly.

SSL is unlikely to be a barrier mid-stream in data transfer, generally
once the handshake is over, encryption does not noticeably impede
data transfer.

-- 
Viktor.



Re: some users can't receive mails from domain.

2018-12-12 Thread Benny Pedersen

Viktor Dukhovni wrote on 2018-12-12 21:29:

> On Dec 12, 2018, at 3:15 PM, Benny Pedersen  wrote:
>
>> I have problems with them as well, turned out have problems with 
>> ssl/tls
>
> If you want the problem solved, start by posting non-verbose logs that
> show the problem behaviour.  Don't get distracted by SSL handshake
> failures from some broken clients, if the client then retries in
> cleartext.  Some systems fail to implement opportunistic TLS sensibly.


Dec 11 09:58:53 localhost postfix/smtpd[24986]: connect from 
rs241.mailgun.us[209.61.151.241]
Dec 11 09:58:54 localhost postfix/smtpd[24986]: SSL_accept error from 
rs241.mailgun.us[209.61.151.241]: 0
Dec 11 09:58:54 localhost postfix/smtpd[24986]: lost connection after 
STARTTLS from rs241.mailgun.us[209.61.151.241]
Dec 11 09:58:54 localhost postfix/smtpd[24986]: disconnect from 
rs241.mailgun.us[209.61.151.241] ehlo=1 starttls=0/1 commands=1/2


more logs ?


Re: some users can't receive mails from domain.

2018-12-12 Thread Viktor Dukhovni
On Wed, Dec 12, 2018 at 10:16:14PM +0100, Benny Pedersen wrote:

> postfix/smtpd[24986]: connect from rs241.mailgun.us[209.61.151.241]
> postfix/smtpd[24986]: SSL_accept error from rs241.mailgun.us[209.61.151.241]: 0
> postfix/smtpd[24986]: lost connection after STARTTLS from 
> rs241.mailgun.us[209.61.151.241]
> postfix/smtpd[24986]: disconnect from rs241.mailgun.us[209.61.151.241] ehlo=1 
> starttls=0/1 commands=1/2

As expected this is a handshake problem, but I would expect to see
additional log messages showing more detailed SSL library error
details.  For example, my logs have:

  postfix/smtpd[72804]: connect from 
sonic315-20.consmr.mail.ne1.yahoo.com[66.163.190.146]
  postfix/smtpd[72804]: SSL_accept error from 
sonic315-20.consmr.mail.ne1.yahoo.com[66.163.190.146]: -1
  postfix/smtpd[72804]: warning: TLS library problem: 
error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown:
ssl/record/rec_layer_s3.c:1528:SSL alert number 46:
  postfix/smtpd[72804]: lost connection after STARTTLS from 
sonic315-20.consmr.mail.ne1.yahoo.com[66.163.190.146]

Which was supposed to have been fixed some time back, but Yahoo
have never quite gotten around to actually doing it.  Anyway, where's
your "TLS library problem" log message?  Perhaps this is a case
where the handshake fails at the TCP layer (the remote end simply
hangs up), in which case Postfix logging may not be as detailed as
it could be.  Here's a patch for 3.3.2, that may show more detail.

diff --git a/src/tls/tls_bio_ops.c b/src/tls/tls_bio_ops.c
index 1f4ec41f..c427a646 100644
--- a/src/tls/tls_bio_ops.c
+++ b/src/tls/tls_bio_ops.c
@@ -279,8 +279,10 @@ int tls_bio(int fd, int timeout, TLS_SESS_STATE *TLScontext,
         case SSL_ERROR_ZERO_RETURN:
         case SSL_ERROR_NONE:
             errno = 0;  /* avoid bogus warnings */
-            /* FALLTHROUGH */
+            return (status);
         case SSL_ERROR_SYSCALL:
+            if (hsfunc && errno != 0)
+                msg_warn("SSL handshake I/O error: %m");
             return (status);
         }
     }
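
Before patching, the sessions described above (a handshake failure with no
"TLS library problem" line) can be picked out of an existing log by smtpd
process ID. A rough sketch, assuming one log record per line (the excerpts in
this thread are wrapped by the mail client); tls_errors_without_detail is a
hypothetical helper name:

```shell
#!/bin/sh
# tls_errors_without_detail: hypothetical helper, not part of Postfix.
# Print the smtpd PIDs that logged "SSL_accept error" but never logged
# a "TLS library problem" warning. Reads the named files, or stdin.
tls_errors_without_detail() {
    awk '
        match($0, /smtpd\[[0-9]+\]/) {
            # "smtpd[" is 6 characters; drop it and the closing "]"
            pid = substr($0, RSTART + 6, RLENGTH - 7)
            if ($0 ~ /SSL_accept error/)    err[pid] = 1
            if ($0 ~ /TLS library problem/) detail[pid] = 1
        }
        END { for (p in err) if (!(p in detail)) print p }
    ' "$@"
}

# Sample records taken from this thread, rejoined onto single lines
tls_errors_without_detail <<'EOF'
postfix/smtpd[24986]: SSL_accept error from rs241.mailgun.us[209.61.151.241]: 0
postfix/smtpd[72804]: SSL_accept error from sonic315-20.consmr.mail.ne1.yahoo.com[66.163.190.146]: -1
postfix/smtpd[72804]: warning: TLS library problem: error:14094416:SSL routines:ssl3_read_bytes:sslv3 alert certificate unknown:ssl/record/rec_layer_s3.c:1528:SSL alert number 46:
EOF
```

PIDs are reused over time, so on a long log this grouping is only a first
pass; it does not replace the extra detail the patch is meant to log.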

-- 
Viktor.