On 09/02/2010 02:19 PM, Patrick Proniewski wrote:
Hi all,
intro: I won't ask for an amavis fix, I just need to make sure my postfix config
is ok before getting support elsewhere with amavis ;)
I'm having a bit of trouble with my production mail gateway:
FreeBSD 7.x in a VMware virtual machine, running on top of a 6-blade HP
chassis, 4 GB RAM and 2 CPUs for the VM,
disk provided by an FC SAN (big array of FC disks).
Postfix 2.7.1 in postmulti mode (3 instances: smtp for inbound,
smtp-liste for list inbound, mailgw for rewrite gateway).
Emails come from external clients to "smtp" and "smtp-liste"; mails for local domains are fed into
"mailgw", which ensures the recipient address is properly rewritten. Mails passing through "smtp" are
filtered before queue via amavisd-new/clamav.
These days I get a lot of warnings in the postfix logs like this one:
smtp/smtpd[91607]: warning: timeout talking to proxy 127.0.0.1:10024
Why? What is amavis doing at that moment?
Of course, amavisd listens on 127.0.0.1:10024
And I also get this:
Sep 2 13:00:47 ru amavis[87682]: (87682-15) TIMING [total 257879 ms] -
SMTP greeting: 25055 (10%)10, SMTP EHLO: 0 (0%)10, SMTP pre-MAIL: 0
(0%)10, SMTP pre-DATA-flush: 7 (0%)10,
SMTP DATA: 24052 (9%)19, check_init: 25053 (10%)29, digest_hdr: 1
(0%)29, digest_body: 0 (0%)29,
gen_mail_id: 21050 (8%)37, mime_decode: 21063 (8%)45, get-file-type1:
21 (0%)45, decompose_part: 1 (0%)45,
parts_decode: 0 (0%)45, check_header: 2 (0%)45, AV-scan-1: 21058
(8%)53, spam-wb-list: 5 (0%)53,
update_cache: 2 (0%)53, decide_mail_destiny: 25156 (10%)63,
fwd-connect: 64265 (25%)88, fwd-xforward: 1 (0%)88,
fwd-mail-pip: 5 (0%)88, fwd-rcpt-pip: 1 (0%)88, fwd-data-chkpnt: 0
(0%)88, write-header: 2 (0%)88,
fwd-data-contents: 0 (0%)88, fwd-end-chkpnt: 4 (0%)88, prepare-dsn: 1
(0%)88, main_log_entry: 12 (0%)88,
update_snmp: 2 (0%)88, SMTP pre-response: 31057 (12%)100, SMTP
response: 0 (0%)100, unlink-2-files: 1 (0%)100,
rundown: 2 (0%)100
258 seconds to filter is not good at all.
decide_mail_destiny, fwd-connect and some SMTP-related steps are especially
bad. I don't think I have hardware contention (RAM is far from full, disk is
quite fast and 80% empty), so it may be a Postfix misconfiguration.
More likely an *amavis* problem. 25 seconds just to greet the client is
insane.
Also, 21 seconds for an AV scan is rather long - this kind of checking
is best left to after-queue filters.
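To see at a glance where the time goes, the TIMING entry can be ranked by stage. The following is only a rough sketch: the field layout ("name: ms (pct%)cumulative") is inferred from the log line quoted above, the `slowest_stages` helper is hypothetical, and the string holds an excerpt of that entry rather than all of it.

```python
import re

# Excerpt of the TIMING entry quoted above (values copied from the log).
timing = (
    "TIMING [total 257879 ms] - SMTP greeting: 25055 (10%)10, "
    "SMTP EHLO: 0 (0%)10, SMTP DATA: 24052 (9%)19, "
    "check_init: 25053 (10%)29, gen_mail_id: 21050 (8%)37, "
    "mime_decode: 21063 (8%)45, AV-scan-1: 21058 (8%)53, "
    "decide_mail_destiny: 25156 (10%)63, fwd-connect: 64265 (25%)88, "
    "SMTP pre-response: 31057 (12%)100, rundown: 2 (0%)100"
)

# Assumed stage layout: "name: <ms> (<pct>%)<cumulative pct>".
STAGE = re.compile(r"([A-Za-z][\w -]*?): (\d+) \(\d+%\)\d+")


def slowest_stages(line, top=5):
    """Return the `top` slowest (stage, milliseconds) pairs, slowest first."""
    body = line.split(" - ", 1)[1]  # drop the "TIMING [total ...]" prefix
    stages = [(name, int(ms)) for name, ms in STAGE.findall(body)]
    return sorted(stages, key=lambda s: s[1], reverse=True)[:top]


for name, ms in slowest_stages(timing):
    print(f"{name:22s} {ms / 1000:6.1f} s")
```

Stages that should do almost no work (gen_mail_id, check_init) take about as long as the AV scan itself, which looks more like the whole process stalling than like any one step being expensive.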
Here are my Postfix settings for connections and process counts. Do you see
anything wrong here that could partly explain my problem?
in main.cf:
smtpd_hard_error_limit = ${stress?3}${stress:20}
smtpd_junk_command_limit = ${stress?3}${stress:100}
in master.cf:
159.x.x.x:smtp inet n - n - 70 smtpd
-o smtpd_proxy_filter=127.0.0.1:10024
-o smtpd_client_connection_count_limit=20
-o smtpd_proxy_ehlo=amavis.at.univ-lyon2.fr
-o inet_interfaces=159.x.x.x
-o smtpd_timeout=300 <-- I just raised this one, but with no luck.
-o header_checks=regexp:/usr/local/etc/postfix/header_checks
#
# After-filter SMTP server. Receive mail from the content filter
# on localhost port 10025.
#
127.0.0.1:10025 inet n - n - - smtpd
-o smtpd_authorized_xforward_hosts=127.0.0.0/8
-o local_recipient_maps=
-o relay_recipient_maps=
-o smtpd_recipient_classes=
-o smtpd_delay_reject=no
-o smtpd_client_restrictions=
-o smtpd_helo_restrictions=
-o smtpd_sender_restrictions=
-o smtpd_recipient_restrictions=permit_mynetworks,reject
-o smtpd_data_restrictions=reject_unauth_pipelining
-o smtpd_end_of_data_restrictions=
-o mynetworks=127.0.0.1/32,localhost,localhost.localdomain
-o strict_rfc821_envelopes=yes
-o smtpd_error_sleep_time=0
-o smtpd_soft_error_limit=1001
-o smtpd_hard_error_limit=1000
-o smtpd_client_connection_count_limit=0
-o smtpd_client_connection_rate_limit=0
I'm not seeing nearly enough. You'd have to provide postconf -n for each
instance involved, and the relevant master.cf entries for them.
Amavisd spawns 80 child processes at launch time (more than the 70 smtpd
available). So when a child is killed, a fresh one is already available
before amavisd can respawn one.
Let's revisit your specs:
FreeBSD 7.x in a VMware virtual machine, running on top of a 6-blade HP
chassis, 4 GB RAM and 2 CPUs for the VM,
disk provided by an FC SAN (big array of FC disks).
ONE amavisd child process can consume 60 to 80 MB of private memory; times
80, this means you're using 4.8 to 6.4 GB of memory you _don't actually have_.
And that's just amavisd; for such a big box you're surely running a
local BIND cache - that likes memory too.
I'm betting that VM is swapping like a *maniac*.
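The arithmetic, as a quick sketch. The 60 to 80 MB per child and the 4 GB of VM RAM come from this thread; the 1 GB set aside for the OS, Postfix, clamd and BIND is purely an assumption for illustration, and `max_children` is a hypothetical helper. Decimal units (1 GB = 1000 MB) are used to match the figures above.

```python
PER_CHILD_MB = (60, 80)  # private memory per amavisd child (from this thread)
CHILDREN = 80            # amavisd children spawned at launch
VM_RAM_MB = 4000         # 4 GB assigned to the VM
RESERVE_MB = 1000        # ASSUMPTION: left for OS, Postfix, clamd, BIND


def total_gb(per_child_mb, children):
    """Memory needed by all children, in decimal GB."""
    return per_child_mb * children / 1000


def max_children(ram_mb, per_child_mb, reserve_mb):
    """How many children fit in RAM once the reserve is set aside."""
    return max(0, (ram_mb - reserve_mb) // per_child_mb)


low, high = (total_gb(mb, CHILDREN) for mb in PER_CHILD_MB)
print(f"80 children need {low:.1f} to {high:.1f} GB")
print("children that fit at 80 MB each:",
      max_children(VM_RAM_MB, PER_CHILD_MB[1], RESERVE_MB))
```

Under these assumptions only a few dozen children fit in the current VM without touching swap, well short of 80.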
Also, to RUN those 80 amavis processes, I would start with at least 4 cores.
The postfix portion of the resources is pretty much limited to having
enough disk throughput - which you have.
Assign a huge buttload of better resources to this VM - say, 8GB and 4
cores.
Or seriously throttle down the number of amavis processes you're running -
what kind of volume are we looking at, that you need 70 concurrent smtpds?
Does the smtpd_client_connection_count_limit include the proxy connections?
Um - whut ?
You're specifying smtpd_client_connection_count_limit on the *incoming*
smtp server.
Amavis doesn't talk to the incoming smtp server.
To prevent "smtp-liste" from eating every available connection to
"smtp" during a local mass mailing, I've set its smtp_destination_concurrency_limit to 1, so that
other, more legitimate clients (30000 physical users) can still send email during a local mass mailing.
I have no idea what this means - you're limiting the *sending* of mail
so that your clients - for whom postfix *receives* mail - aren't
bottlenecked?
It is very rare for the actual sending of internet mail to be the
bottleneck - unless you're spamming, of course.
Let me know if you need other info…
Patrick PRONIEWSKI