TLS Handshake Problems

2017-11-28 Thread Nikolaos Milas

Hello,

I have just started using in production a mail server running Postfix 
3.2.4 on CentOS 7.4 (fully patched) with openssl 1.0.2k.


This is a new server, replacing an old CentOS 5.11 with Postfix 2.6.11 
and OpenSSL 0.9.8e.


On the new server I see errors on particular servers as follows; the 
mail is finally delivered but with delay, after being deferred:


Nov 28 12:55:43 vmail2 postfix/submission/smtpd[31782]: 4623A80004F2F: 
client=zeus.admin.noa.gr[195.251.204.18], sasl_method=login, 
sasl_username=elke.rescom
Nov 28 12:55:43 vmail2 postfix/cleanup[30388]: 4623A80004F2F: 
message-id=<20171128105543.4623a80004...@vmail2.noa.gr>
Nov 28 12:55:43 vmail2 opendkim[4708]: 4623A80004F2F: DKIM-Signature 
field added (s=default, d=noa.gr)
Nov 28 12:55:43 vmail2 postfix/qmgr[6529]: 4623A80004F2F: 
from=, size=747, nrcpt=1 (queue active)
Nov 28 12:55:43 vmail2 postfix/smtp[782]: SSL_connect error to 
rcs12.rc.auth.gr[155.207.51.12]:25: lost connection
Nov 28 12:55:43 vmail2 postfix/smtp[782]: 4623A80004F2F: Cannot start 
TLS: handshake failure
Nov 28 12:55:43 vmail2 postfix/smtp[782]: SSL_connect error to 
rcs13.rc.auth.gr[155.207.144.13]:25: lost connection
Nov 28 12:55:43 vmail2 postfix/smtp[782]: 4623A80004F2F: Cannot start 
TLS: handshake failure
Nov 28 12:55:44 vmail2 postfix/smtp[782]: 4623A80004F2F: host 
mailsrv1.ccf.auth.gr[155.207.1.1] said: 450 4.7.1 try again later (in 
reply to DATA command)
Nov 28 12:55:45 vmail2 postfix/smtp[782]: 4623A80004F2F: 
to=, relay=mailsrv2.ccf.auth.gr[155.207.1.2]:25, 
delay=2.4, delays=0.096/0.001/1
.8/0.52, dsn=4.7.1, status=deferred (host 
mailsrv2.ccf.auth.gr[155.207.1.2] said: 450 4.7.1 try again later (in 
reply to DATA command))

...
Nov 28 13:01:22 vmail2 postfix/qmgr[6529]: 4623A80004F2F: 
from=, size=747, nrcpt=1 (queue active)
Nov 28 13:01:22 vmail2 postfix/smtp[782]: SSL_connect error to 
rcs12.rc.auth.gr[155.207.51.12]:25: lost connection
Nov 28 13:01:22 vmail2 postfix/smtp[782]: 4623A80004F2F: Cannot start 
TLS: handshake failure
Nov 28 13:01:22 vmail2 postfix/smtp[782]: 4623A80004F2F: 
to=, relay=rcs12.rc.auth.gr[155.207.51.12]:25, 
delay=340, delays=339/0.002/0.15/0.35, dsn=2.6.0, status=sent (250 2.6.0 
<20171128105543.4623a80004...@vmail2.noa.gr> Queued mail for delivery)

Nov 28 13:01:22 vmail2 postfix/qmgr[6529]: 4623A80004F2F: removed

On the older server, I didn't see such errors.

Can you please help me understand why this happens and if I can resolve 
it by using specific settings?


Here is postconf -n:

# postconf -n
alias_database = hash:/etc/postfix/aliases, 
hash:/etc/postfix/aliases.d/virtual_aliases

alias_maps = hash:/etc/aliases
allowed_gein = check_client_access 
cidr:/etc/postfix/gein_admin_ips.cidr,reject
allowed_iaasars = check_client_access 
cidr:/etc/postfix/iaasars_admin_ips.cidr,reject
allowed_list1 = check_sasl_access 
hash:/etc/postfix/allowed_groupmail_users,reject

allowed_list2 = permit_sasl_authenticated,reject
allowed_meteo = check_client_access 
cidr:/etc/postfix/meteo_admin_ips.cidr,reject

broken_sasl_auth_clients = yes
command_directory = /usr/sbin
controlled_senders = check_sender_access hash:/etc/postfix/blocked_senders
daemon_directory = /usr/libexec/postfix
data_directory = /var/lib/postfix
debug_peer_level = 2
debugger_command = PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin 
xxgdb $daemon_directory/$process_name $process_id & sleep 5

default_process_limit = 25
delay_logging_resolution_limit = 3
deliver_lock_attempts = 40
gwcheck = reject_unverified_recipient, reject_unauth_destination
home_mailbox = Maildir/
html_directory = no
inet_interfaces = all
inet_protocols = ipv4, ipv6
local_header_rewrite_clients = static:all
mail_name = IC-XC-NI-KA
mail_owner = postfix
mailq_path = /usr/bin/mailq.postfix
manpage_directory = /usr/share/man
message_size_limit = 41943040
meta_directory = /etc/postfix
milter_default_action = accept
mydestination = $myhostname, localhost.$mydomain, localhost
mydomain = noa.gr
myhostname = vmail2.noa.gr
mynetworks = 195.251.204.0/24, 195.251.202.0/23, 194.177.194.0/23, 
127.0.0.0/8, 10.201.0.0/16, [2001:648:2011::]/48, 83.212.5.24/29, 
[2001:648:2ffc:1115::]/64, 62.217.124.0/29, [2001:648:2ffc:126::]/64

myorigin = $mydomain
newaliases_path = /usr/bin/newaliases.postfix
non_smtpd_milters = $smtpd_milters
parent_domain_matches_subdomains =
postfwdcheck = check_policy_service inet:127.0.0.1:10040
queue_directory = /var/spool/postfix
readme_directory = /usr/share/doc/postfix3-3.2.4/README_FILES
recipient_canonical_maps = hash:/etc/postfix/domainrecipientmap
relay_domains = $mydestination
sample_directory = /usr/share/doc/postfix3-3.2.4/samples
sender_canonical_maps = hash:/etc/postfix/domainsendermap
sendmail_path = /usr/sbin/sendmail.postfix
setgid_group = postdrop
shlib_directory = /usr/lib/postfix
smtp_tls_security_level = may
smtpd_client_restrictions = 
permit_mynetworks,permit_sasl_authenticated,reject

smtpd_delay_reject = yes
smtpd_end_of_data_restrictions = check_client_access 

Re: TLS Handshake Problems

2017-11-28 Thread Viktor Dukhovni
On Tue, Nov 28, 2017 at 09:27:49PM +0200, Nikolaos Milas wrote:

> Nov 28 12:55:43 vmail2 postfix/smtp[782]: SSL_connect error to 
> rcs12.rc.auth.gr[155.207.51.12]:25: lost connection
> Nov 28 12:55:43 vmail2 postfix/smtp[782]: 4623A80004F2F: Cannot start TLS: 
> handshake failure
> Nov 28 12:55:43 vmail2 postfix/smtp[782]: SSL_connect error to 
> rcs13.rc.auth.gr[155.207.144.13]:25: lost connection
> Nov 28 12:55:43 vmail2 postfix/smtp[782]: 4623A80004F2F: Cannot start TLS: 
> handshake failure
> Nov 28 12:55:44 vmail2 postfix/smtp[782]: 4623A80004F2F: host 
> mailsrv1.ccf.auth.gr[155.207.1.1] said: 450 4.7.1 try again later (in reply 
> to DATA command)
> Nov 28 12:55:45 vmail2 postfix/smtp[782]: 4623A80004F2F: 
> to=, relay=mailsrv2.ccf.auth.gr[155.207.1.2]:25, 
> delay=2.4, delays=0.096/0.001/1.8/0.52, dsn=4.7.1, status=deferred (host 
> mailsrv2.ccf.auth.gr[155.207.1.2]
> said: 450 4.7.1 try again later (in reply to DATA command))
>...
> Nov 28 13:01:22 vmail2 postfix/smtp[782]: SSL_connect error to 
> rcs12.rc.auth.gr[155.207.51.12]:25: lost connection
> Nov 28 13:01:22 vmail2 postfix/smtp[782]: 4623A80004F2F: Cannot start TLS: 
> handshake failure
> Nov 28 13:01:22 vmail2 postfix/smtp[782]: 4623A80004F2F: 
> to=, relay=rcs12.rc.auth.gr[155.207.51.12]:25, 
> delay=340, delays=339/0.002/0.15/0.35, dsn=2.6.0, status=sent (250 2.6.0
> <20171128105543.4623a80004...@vmail2.noa.gr> Queued mail for delivery)

This is expected.  The first two MX hosts are ancient Microsoft
Exchange servers, probably on Windows 2003, where the TLS stack
is rather crippled.  Their only working TLS ciphersuite is RC4,
and they only see the first 64 ciphersuites in the TLS HELLO.

The third uses greylisting to force a later retry.

$ dig +short -t mx rc.auth.gr | sort -n | awk '{print $NF}' | sed 's/\.$//'
rcs12.rc.auth.gr
rcs13.rc.auth.gr
mailsrv1.ccf.auth.gr
mailsrv2.ccf.auth.gr

$ for mx in $(dig +short -t mx rc.auth.gr | sort -n | awk '{print $NF}' | 
sed 's/\.$//')
  do
posttls-finger -c -Lsummary,ssl-debug "[$mx]"; echo
  done
posttls-finger: SSL_connect:before/connect initialization
posttls-finger: SSL_connect:SSLv2/v3 write client hello A
posttls-finger: SSL_connect error to rcs12.rc.auth.gr[155.207.51.12]:25: 
lost connection

posttls-finger: SSL_connect:before/connect initialization
posttls-finger: SSL_connect:SSLv2/v3 write client hello A
posttls-finger: SSL_connect error to rcs13.rc.auth.gr[155.207.144.13]:25: 
lost connection

posttls-finger: SSL_connect:before/connect initialization
...
posttls-finger: Untrusted TLS connection established to 
mailsrv1.ccf.auth.gr[155.207.1.1]:25: TLSv1.2 with cipher AES256-GCM-SHA384 
(256/256 bits)

posttls-finger: SSL_connect:before/connect initialization
...
posttls-finger: Untrusted TLS connection established to 
mailsrv2.ccf.auth.gr[155.207.1.2]:25: TLSv1.2 with cipher AES256-GCM-SHA384 
(256/256 bits)

> On the older server, I didn't see such errors.

If you want to use TLS with these long past their use-by date
Exchange servers, you need to disable a bunch of unnecessary
ciphers that push RC4 out of the first 64 slots:

smtp_tls_exclude_ciphers = MD5, aDSS, kECDH, kDH, SEED, IDEA, RC2, RC5

With that, you'll be able to complete a TLS handshake even with
the first two MX hosts:

$ for mx in $(dig +short -t mx rc.auth.gr | sort -n | awk '{print $NF}' | 
sed 's/\.$//')
  do
posttls-finger -o "tls_medium_cipherlist=$(postconf -hd 
tls_medium_cipherlist):"'!MD5:!aDSS:!kECDH:!kDH:!SEED:!IDEA:!RC2:!RC5' -c -lmay 
-Lsummary "[$mx]"
  done
posttls-finger: Untrusted TLS connection established to 
rcs12.rc.auth.gr[155.207.51.12]:25: TLSv1 with cipher RC4-SHA (128/128 bits)
posttls-finger: Untrusted TLS connection established to 
rcs13.rc.auth.gr[155.207.144.13]:25: TLSv1 with cipher RC4-SHA (128/128 bits)
posttls-finger: Untrusted TLS connection established to 
mailsrv1.ccf.auth.gr[155.207.1.1]:25: TLSv1.2 with cipher AES256-GCM-SHA384 
(256/256 bits)
posttls-finger: Untrusted TLS connection established to 
mailsrv2.ccf.auth.gr[155.207.1.2]:25: TLSv1.2 with cipher AES256-GCM-SHA384 
(256/256 bits)

Disabling the above cipher classes is harmless, they should not be
needed in practice.  This is only needed for TLS with an increasingly
rare set of peer servers, those still stuck running long unsupported
14-year old Windows software.  With a bit of luck these will be
replaced some day soon, they probably have multiple unpatched security
issues.

-- 
Viktor.


Re: TLS Handshake Problems

2017-11-28 Thread Nikolaos Milas

On 28/11/2017 9:57 μμ, Viktor Dukhovni wrote:


This is expected.
...


Thank you Viktor for the detailed analysis and for your time.

I appreciate it very much.

All the best,
Nick


Kerberos principal name mismatch

2017-11-28 Thread Anvar Kuchkartaev
 ‎Hello I configured 2x postfix instances which uses shared disk as mail storage and they act as MX server and smtp server at same time. First server mx0.example.com and mx1.example.com and smtp.example.com points to ip address of both servers. The service principal smtp/mx0.example@example.com controlled by both hosts and saslauthd is also configured to use service keytab (entire system is managed by freeipa). If I use smtp.example.com (alias of service principal) to send emails I am getting following error:nov 28 23:44:21 mx0.example.com postfix/smtps/smtpd[6110]: GSSAPIserver step 1nov 28 23:44:21 mx0.example.com postfix/smtps/smtpd[6110]: warning:SASL authentication failure: GSSAPI Error: Unspecified GSSfailure.  Minor code may provide more information (Request ticketserver smtp/smtp.example@example.com found in keytab but does notmatch server principal smtp/mx0.example.com@)nov 28 23:44:21 mx0.example.com postfix/smtps/smtpd[6110]: warning:[xxx.xxx.xxx.xxx]: SASL GSSAPI authentication failed: authenticationfailureIf I use mx0.example.com (primary alias of service principal) I can send emails easily. Currently I am using plain auth as workaround to send emails. Does anyone have idea to solve/debug this?Anvar Kuchkartaev an...@anvartay.com 



What am I missing in this client check???

2017-11-28 Thread Bill Cole

I fear that I've misunderstood something for a very long time...

Why might hwsrv-205226.hostwindsdns.com not be hitting the (redundant, I 
think) entries here?


   bigsky:~ root# postmap -vs /etc/postfix/client_checks
   postmap: name_mask: ipv4
   postmap: inet_addr_local: configured 5 IPv4 addresses
   postmap: Compiled against Berkeley DB version 1
   postmap: dict_open: hash:/etc/postfix/client_checks
   getresponse.com  550 5.7.1 GetResponse spam unwanted
   user.veloxzone.com.br550 5.7.1 Veloxzone users may not mail here
   link 550 5.7.1 Get a real domain, spammy
   hostwindsdns.com	 550 5.7.1 Too many brute force attacks from your 
hosting provider
   .hostwindsdns.com	550 5.7.1 Too many brute force attacks from your 
hosting provider
   salsalabs.net	550 5.7.1 SalsaLabs has shitty list management 
practices.
   newsletterbroadcast.net	550 5.7.1 Hostway's spamming services not 
welcome here
   siteprotect.com	550 5.7.1 Hostway's spamming services not welcome 
here
   checkmail.io	550 5.7.1 Address Verification is a fraudulent 
business. GFY & DIAF


   bigsky:~ root# echo $?
   0

   bigsky:~ root# postmap -q hwsrv-205226.hostwindsdns.com -v 
/etc/postfix/client_checks

   postmap: name_mask: ipv4
   postmap: inet_addr_local: configured 5 IPv4 addresses
   postmap: Compiled against Berkeley DB version 1
   postmap: dict_open: hash:/etc/postfix/client_checks

   bigsky:~ root# echo $?
   1

Canonical config outputs follow:

   bigsky:~ root# postconf -nf
   body_checks = pcre:/opt/local/etc/postfix/body_checks
   bounce_size_limit = 5
   command_directory = /opt/local/sbin
   compatibility_level = 2
   daemon_directory = /opt/local/libexec/postfix
   data_directory = /opt/local/var/lib/postfix
   debug_peer_level = 3
   debug_peer_list = 127.0.0.1
   debugger_command = PATH=/opt/local/bin:/bin:/usr/bin:/usr/local/bin; 
export
   PATH; (echo cont; echo where) | gdb 
$daemon_directory/$process_name
   $process_id 2>&1 
>$config_directory/$process_name.$process_id.log & sleep 5

   default_database_type = hash
   default_destination_concurrency_limit = 10
   disable_vrfy_command = yes
   enable_long_queue_ids = yes
   header_checks = regexp:/opt/local/etc/postfix/header_checks
   home_mailbox = Maildir/
   html_directory = no
   inet_interfaces = all
   inet_protocols = ipv4
   mail_owner = _postfix
   mailq_path = /opt/local/bin/mailq
   manpage_directory = /opt/local/share/man
   message_size_limit = 4096
   milter_command_timeout = 120s
   milter_connect_timeout = 45s
   milter_rcpt_macros = i {rcpt_addr} {rcpt_host} {rcpt_mailer}
   mydestination = $myhostname, localhost.$mydomain_fallback
   mydomain = scconsult.com
   mydomain_fallback = scconsult.com
   myhostname = toaster.scconsult.com
   mynetworks = 192.168.254.0/24
   mynetworks_style = subnet
   myorigin = $myhostname
   newaliases_path = /opt/local/bin/newaliases
   postscreen_access_list = permit_mynetworks
   postscreen_disable_vrfy_command = yes
   postscreen_dnsbl_action = enforce
   postscreen_dnsbl_sites = cbl.abuseat.org=127.0.0.2*2
   zen.spamhaus.org=127.0.0.2*2 zen.spamhaus.org=127.0.0.3*2
   zen.spamhaus.org=127.0.0.4*2 zen.spamhaus.org=127.0.0.10*2
   zen.spamhaus.org=127.0.0.11*2 korea.services.net=127.0.0.2*2
   blackholes.scconsult.com=127.0.0.2*1 
sbcdyn.scconsult.com=127.0.0.2*1

   psbl.surriel.com=127.0.0.2*1 ix.dnsbl.manitu.net=127.0.0.2*1
   postscreen_dnsbl_threshold = 2
   postscreen_dnsbl_ttl = 10m
   postscreen_greet_action = drop
   postscreen_helo_required = $smtpd_helo_required
   postscreen_whitelist_interfaces = !127.0.0.2,static:all
   proxy_interfaces = 67.149.19.3, 67.149.19.4, 67.149.19.5
   queue_directory = /opt/local/var/spool/postfix
   readme_directory = /opt/local/share/postfix/readme
   recipient_delimiter = -
   sample_directory = /opt/local/share/postfix/sample
   sender_bcc_maps = pcre:/etc/postfix/sender_bccs
   sendmail_path = /opt/local/sbin/sendmail
   setgid_group = _postdrop
   sewers = check_recipient_access 
pcre:/opt/local/etc/postfix/sewer-recipients

   check_sender_access pcre:/opt/local/etc/postfix/sewer-senders
   smtp_connection_cache_destinations =
   smtp_dns_support_level = dnssec
   smtp_generic_maps = regexp:/opt/local/etc/postfix/generic
   smtp_tls_CAfile = /opt/local/etc/openssl/cert.pem
   smtp_tls_loglevel = 1
   smtp_tls_security_level = dane
   smtpd_authorized_xclient_hosts = localhost
   smtpd_client_auth_rate_limit = 5
   smtpd_client_connection_count_limit = 20
   smtpd_client_connection_rate_limit = 6
   smtpd_client_message_rate_limit = 15
   smtpd_client_new_tls_session_rate_limit = 5
   smtpd_client_recipient_rate_limit = 20
   smtpd_client_restrictions = check_client_access
   hash:/opt/local/etc/postfix/client_checks, permit
   smtpd_data_restrictions =
   reject_multi_recipient_bounce,reject_unauth_pipelining,permit
   smtpd_delay_open_until_valid_rcpt = no
   smtpd_error_sleep_time = 3
   smtpd_

Re: Kerberos principal name mismatch

2017-11-28 Thread Viktor Dukhovni
On Wed, Nov 29, 2017 at 12:31:21AM +0100, Anvar Kuchkartaev wrote:

>Hello I configured 2x postfix instances which uses shared disk as mail
>storage and they act as MX server and smtp server at same time. First
>server mx0.example.com and mx1.example.com and smtp.example.com points
>to ip address of both servers. The service principal
>smtp/mx0.example@example.com controlled by both hosts and saslauthd
>is also configured to use service keytab (entire system is managed by
>freeipa). If I use smtp.example.com (alias of service principal) to
>send emails I am getting following error:
> 
>nov 28 23:44:21 mx0.example.com postfix/smtps/smtpd[6110]: GSSAPI
>server step 1
>nov 28 23:44:21 [1]mx0.example.com postfix/smtps/smtpd[6110]: warning:
>SASL authentication failure: GSSAPI Error: Unspecified GSS
>failure.  Minor code may provide more information (Request ticket
>server smtp/[2]smtp.example@example.com found in keytab but does
>not
>match server principal smtp/[3]mx0.example.com@)
>nov 28 23:44:21 [4]mx0.example.com postfix/smtps/smtpd[6110]: warning:
>[[5]xxx.xxx.xxx.xxx]: SASL GSSAPI authentication failed: authentication
>failure
> 
>If I use mx0.example.com (primary alias of service principal) I can
>send emails easily. Currently I am using plain auth as workaround to
>send emails. Does anyone have idea to solve/debug this?

The Cyrus SASL library does not support wildcard server credentials
(GSS_C_NO_CREDENTIAL).  Instead each SASL service must specify an
explicit service name (service@host) and this must be the name for
which clients obtain tickets.  IIRC Postfix passes "smtp@myhostname"
to Cyrus SASL as its service name.  Therefore, any given Postfix
instance can only support Kerberos clients that expect to connect
to the hostname that exactly matches the main.cf "myhostname"
setting.  The keytab file can of course be shared, and contain
one entry for each Postfix instance hostname.

The "dovecot" SASL backend does not share the same limitation.  So
if you configure Postfix to use the "dovecot" SASL backend, you
should be able to support multiple names in a single instance.

My dovecot configuration has:

auth_realms = ...
auth_mechanisms = gssapi plain
auth_gssapi_hostname = "$ALL"
auth_krb5_keytab = /var/spool/keytabs/imap

That magic "$ALL" token enables wildcard credentials, the
server will accept tickets for any principal name with keys
in the selected keytab file.

-- 
Viktor.


Re: What am I missing in this client check???

2017-11-28 Thread Viktor Dukhovni
On Tue, Nov 28, 2017 at 08:57:05PM -0500, Bill Cole wrote:

> I fear that I've misunderstood something for a very long time...
> 
> Why might hwsrv-205226.hostwindsdns.com not be hitting the (redundant, I
> think) entries here?

The postmap(1) command does not support any of the partial key
lookups performed by Postfix when doing access(5), transport(5),
virtual(5), generic(5), ... lookups.  Indeed not all these higher-level
map types generate partial keys in the same way.  So postmap cannot
without further (not-yet implemented options) know how to do these
lookups.

>bigsky:~ root# postmap -vs /etc/postfix/client_checks
>.hostwindsdns.com  550 5.7.1 Too many brute force attacks from your 
> hosting provider

That's a partial key.

>bigsky:~ root# postmap -q hwsrv-205226.hostwindsdns.com -v 
> /etc/postfix/client_checks

This key is not present (verbatim) in the lookup table.

-- 
Viktor.


Re: What am I missing in this client check???

2017-11-28 Thread Bill Cole

On 28 Nov 2017, at 21:09 (-0500), Viktor Dukhovni wrote:


On Tue, Nov 28, 2017 at 08:57:05PM -0500, Bill Cole wrote:


I fear that I've misunderstood something for a very long time...

Why might hwsrv-205226.hostwindsdns.com not be hitting the 
(redundant, I

think) entries here?


The postmap(1) command does not support any of the partial key
lookups performed by Postfix when doing access(5), transport(5),
virtual(5), generic(5), ... lookups.  Indeed not all these 
higher-level

map types generate partial keys in the same way.  So postmap cannot
without further (not-yet implemented options) know how to do these
lookups.


Which also answers my unasked question: "Why isn't there a postmap 
option to specify a lookup strategy?"


Thanks, Viktor. I have confirmed the desired behavior elsewise:

bigsky:~ root# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 toaster.scconsult.com ESMTP Postfix
xclient name=hwsrv-205226.hostwindsdns.com
220 toaster.scconsult.com ESMTP Postfix
ehlo hwsrv-205226.hostwindsdns.com
250-toaster.scconsult.com
250-PIPELINING
250-SIZE 4096
250-ETRN
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-DSN
250 SMTPUTF8
mail from:<>
250 2.1.0 Ok
rcpt to:
550 5.7.1 : Client host 
rejected: Too many brute force attacks from your hosting provider

quit
221 2.0.0 Bye
Connection closed by foreign host.

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole


Slow delivery of mails reg.

2017-11-28 Thread anant

Hi,

We have a internal relay configured in our setup.  The relay host had 
stopped accepting mails as the system was hanging and we could resolve

it only after about 8 hrs.  Meanwhile the queue built up and there
were about 80,000 mails in queue to relay to the relay host.  Once the
system was made up, mails started delivering to the relay host at a
very slow rate. It took almost 10 hours to clear the queue.

What I could understand by googling was, postfix detected the relay
host to be dead and due to the parameters like minimal_backoff_time
and maximal_backoff_time, the delivery rate to the relay host was slow. 

I am not sure, whether this understanding is correct. 

What is the correct way to come out of this situation and deliver the
mails faster after recovery of relay host.

Waiting for your valuable suggestions.
 Anant S Athavale
--
Confidentiality Notice: This e-mail message, including any attachments, is for
the sole use of the intended recipient(s) and may contain confidential and
privileged information. Any unauthorized review, use, disclosure or
distribution is prohibited. If you are not the intended recipient, please
contact the sender by reply e-mail and destroy all copies of the original
message.
--



Re: Slow delivery of mails reg.

2017-11-28 Thread Viktor Dukhovni
On Wed, Nov 29, 2017 at 09:20:21AM +0530, an...@isac.gov.in wrote:

> We have a internal relay configured in our setup.  The relay host hadstopped
> accepting mails as the system was hanging and we could resolve
> it only after about 8 hrs.  Meanwhile the queue built up and there
> were about 80,000 mails in queue to relay to the relay host.  Once the
> system was made up, mails started delivering to the relay host at a
> very slow rate. It took almost 10 hours to clear the queue.

This is almost certainly because the relay host could not process
the resumed mail flow faster.  Another possibility is that there
are residual DNS issues that slow down each delivery.

> What I could understand by googling was, postfix detected the relay
> host to be dead and due to the parameters like minimal_backoff_time
> and maximal_backoff_time, the delivery rate to the relay host was slow. 

This is wrong.  The dead host detection state is cached only for a short
time, and cannot account for slow delivery once the host is back up.

> What is the correct way to come out of this situation and deliver the
> mails faster after recovery of relay host.

Make sure the relay host can process email quickly, find and resolve
all DNS-related issues.

To understand why delivery was slow you need to analyze your logs
and examine the "delays=" log entries to see whether the delays
in connecting to the relay host or waiting for deliveries to
complete.

-- 
Viktor.


Re: Slow delivery of mails reg.

2017-11-28 Thread anant

 - Message from Viktor Dukhovni  -
    Date: Wed, 29 Nov 2017 04:47:35 +
    From: Viktor Dukhovni 
Reply-To: postfix-users@postfix.org
Subject: Re: Slow delivery of mails reg.
      To: postfix-users@postfix.org


On Wed, Nov 29, 2017 at 09:20:21AM +0530, an...@isac.gov.in wrote:


We have a internal relay configured in our setup.  The relay host hadstopped
accepting mails as the system was hanging and we could resolve
it only after about 8 hrs.  Meanwhile the queue built up and there
were about 80,000 mails in queue to relay to the relay host.  Once the
system was made up, mails started delivering to the relay host at a
very slow rate. It took almost 10 hours to clear the queue.


This is almost certainly because the relay host could not process
the resumed mail flow faster.  Another possibility is that there
are residual DNS issues that slow down each delivery.


What I could understand by googling was, postfix detected the relay
host to be dead and due to the parameters like minimal_backoff_time
and maximal_backoff_time, the delivery rate to the relay host was slow. 


This is wrong.  The dead host detection state is cached only for a short
time, and cannot account for slow delivery once the host is back up.


What is the correct way to come out of this situation and deliver the
mails faster after recovery of relay host.


Make sure the relay host can process email quickly, find and resolve
all DNS-related issues.

To understand why delivery was slow you need to analyze your logs
and examine the "delays=" log entries to see whether the delays
in connecting to the relay host or waiting for deliveries to
complete.

--        Viktor.


The log entries with delays is as below.

relay=x.x.x.x[x.x.x.x]:25, conn_use=3, delay=32027, delays=31986/31/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=5, delay=27740, delays=27690/40/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=6, delay=13069, delays=13010/49/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=7, delay=21476, delays=21406/60/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=2, delay=33553, delays=33474/69/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=12, delay=14591, delays=14493/88/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=4, delay=25953, delays=25807/136/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=6, delay=27339, delays=25970/1359/0/10,
relay=x.x.x.x[x.x.x.x]:25,conn_use=16, delay=7608, delays=4510/3088/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=19, delay=7457, delays=2593/4854/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=20, delay=38734, delays=33864/4860/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=7, delay=11656, delays=5728/5918/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=19, delay=40339, delays=31320/9009/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=13, delay=10820, delays=441/10369/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=10, delay=39517, delays=29103/10404/0/10
relay=x.x.x.x[x.x.x.x]:25,conn_use=28, delay=43790, delays=33444/10336/0/10

the above log entries are at different times on the same day.

Please share your analysis for further improving our setup.

Regadrs,
ANANT.

- End message from Viktor Dukhovni  -
 Anant S Athavale
--
Confidentiality Notice: This e-mail message, including any attachments, is for
the sole use of the intended recipient(s) and may contain confidential and
privileged information. Any unauthorized review, use, disclosure or
distribution is prohibited. If you are not the intended recipient, please
contact the sender by reply e-mail and destroy all copies of the original
message.
--



Re: Slow delivery of mails reg.

2017-11-28 Thread Viktor Dukhovni
On Wed, Nov 29, 2017 at 12:04:14PM +0530, an...@isac.gov.in wrote:

> > To understand why delivery was slow you need to analyze your logs
> > and examine the "delays=" log entries to see whether the delays
> > in connecting to the relay host or waiting for deliveries to
> > complete.
> > 
> > --        Viktor.
> 
> The log entries with delays is as below.
> 
> relay=x.x.x.x[x.x.x.x]:25, conn_use=3, delay=32027, delays=31986/31/0/10

Well, the main issue here is the consistent 10s delay to complete
a delivery after 0s connection latency.  I would guess that's some
sort of timeout on the relay end resolving the IP address of the
connecting client.  With 10s to complete each delivery, your
throughput is not surprisingly quite low.

> relay=x.x.x.x[x.x.x.x]:25,conn_use=5, delay=27740, delays=27690/40/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=6, delay=13069, delays=13010/49/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=7, delay=21476, delays=21406/60/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=2, delay=33553, delays=33474/69/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=12, delay=14591, delays=14493/88/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=4, delay=25953, delays=25807/136/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=6, delay=27339, delays=25970/1359/0/10,
> relay=x.x.x.x[x.x.x.x]:25,conn_use=16, delay=7608, delays=4510/3088/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=19, delay=7457, delays=2593/4854/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=20, delay=38734, delays=33864/4860/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=7, delay=11656, delays=5728/5918/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=19, delay=40339, delays=31320/9009/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=13, delay=10820, delays=441/10369/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=10, delay=39517, delays=29103/10404/0/10
> relay=x.x.x.x[x.x.x.x]:25,conn_use=28, delay=43790, delays=33444/10336/0/10
> 
> the above log entries are at different times on the same day.
> 
> Please share your analysis for further improving our setup.

Fix the downstream delivery latency.  The relay is either having
trouble with DNS resolution or is actively misconfigured to impose
an artificial delay on client connections, perhaps some sort of
anti-spam measure that could make some sense on a host getting
mail directly from the Internet, but is a bad idea for internal
relays.

-- 
Viktor.