Ali Majdzadeh put forth on 12/1/2009 12:25 AM:
> Dear friends,
> Thanks for this nice discussion. Actually, as a project, we are going to
> deliver an e-mail architecture which supports over 1000000 users. We use
> Postfix, courier-imap, amavisd-new, spamassassin and clamav and of
> course the tools needed to balance the load between multiple instances
> of the mentioned tools. We use specmail to test our architecture.
> Recently, we have introduced our intended e-mail filtering platform
> consisting of amavisd-new, spamassassin and clamav to the architecture and
> we have observed significant delivery time decrease regarding Postfix.
> As a way out, we thought of the ways which made it possible to do
> offline virus scanning, but actually we have found that amavisd-new
> together with its filtering tools is a serious performance bottleneck.
> I really appreciate suggestions regarding this scenario.

Hi Ali,

First off, this is an edge solution, correct?  These Postfix servers are
MX hosts?  If so...

I humbly, but seriously, suggest you hire Victor or another highly
qualified Postfix engineer to assist you with architecting your 1
million user solution.  Also, SPECmail2009 is not a valid test of what
your real world mail stream will be once you go live.  You absolutely
cannot rely on this benchmark to give you realistic feedback on the
performance of your architecture.  It doesn't, and cannot, simulate real
spam streams.  And spam attempts will be 50-90% of your real world
connection load.

Summary:

 SPECmail2009

The SPECmail2009 benchmark measures the ability of corporate e-mail
systems to meet today's demanding e-mail users over fast corporate local
area networks (LAN). The SPECmail2009 benchmark simulates corporate mail
server workloads that range from 250 to 10,000 or more users, using
industry standard SMTP and IMAP4 protocols. This e-mail server benchmark
creates client workloads based on a 40,000 user corporation, and uses
folder and message MIME structures that include both traditional office
documents and a variety of rich media content. The benchmark also adds
support for encrypted network connections using industry standard SSL
v3.0 and TLS 1.0 technology. SPECmail2009 replaces all versions of
SPECmail2008, first released in August 2008. The results from the two
benchmarks are not comparable. With the availability of SPECmail2009,
SPEC has retired the SPECmail2008 benchmark. SPEC will stop accepting
new SPECmail2008 results as of the submission deadline on June 12, 2009.


For a 1 million user system, you absolutely need to kill 90%+ of your
spam load _before_ piping inbound connections to your AS/AV content
filter daemons.  You are seeing why already with the results of this
synthetic benchmark pumping only _legit_ mail through your system.  Of
your inbound spam, you should be able to kill on the order of 50-80% or
more, with merely the following, _BEFORE_ piping to SpamAssassin,
clamav, or amavisd-new:

smtpd_client_restrictions =
        reject_unknown_client_hostname
        reject_unauth_pipelining

smtpd_sender_restrictions =
        reject_non_fqdn_sender

smtpd_helo_required = yes
smtpd_helo_restrictions =
        reject_non_fqdn_helo_hostname
        reject_invalid_helo_hostname
        reject_unknown_helo_hostname

smtpd_recipient_restrictions =
        permit_mynetworks
        reject_unauth_destination
        reject_unlisted_recipient
        reject_rbl_client zen.spamhaus.org
        check_policy_service inet:127.0.0.1:60000

For a 1 million user site, you'll need to make arrangements with
Spamhaus to get access to the Data Feed Service.  The above usage
example is for smaller sites with low query rates.  You'd need to run
rbldnsd on your postfix servers or mirror the Spamhaus zone(s) on a
local dns server.  That's beyond the scope of this email.
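
To make the lookup mechanics concrete, here is a minimal Python sketch of how a DNSBL query name is formed for an IPv4 client.  The function name and the `zone` parameter are hypothetical, chosen only for illustration; pointing `zone` at a locally mirrored copy of the list is an assumption about how you might wire this up, not a Spamhaus or Postfix API.  A listed address normally answers with an A record in 127.0.0.0/8, and an unlisted one with NXDOMAIN.

```python
import ipaddress


def dnsbl_query_name(client_ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Return the DNS name that a DNSBL check looks up for an IPv4 client.

    The octets of the client address are reversed and prepended to the
    list's zone, e.g. 192.0.2.99 -> 99.2.0.192.<zone>.
    """
    # IPv4Address raises ValueError if client_ip is not a valid IPv4 address.
    addr = ipaddress.IPv4Address(client_ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # 127.0.0.2 is the conventional DNSBL test entry.
    print(dnsbl_query_name("127.0.0.2"))
```

At high query volume every inbound connection triggers one such lookup, which is why a local copy of the zone matters at this scale.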

The policy service above is the Postfix greylisting daemon called
postgrey.  It is very effective against botnets, i.e. infected PCs on
residential broadband.  It will kill a ton of spam without consuming
anywhere near the resources of content filters.
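
The idea behind greylisting can be sketched in a few lines of Python.  This is not postgrey's code, only an illustration of the technique under stated assumptions: the class name, the in-memory dictionary, and the 300 second delay are all hypothetical choices, and a real daemon would persist its state and expire old entries.  The returned strings are ordinary Postfix access actions.

```python
import time
from typing import Callable, Dict, Tuple

# (client IP, envelope sender, envelope recipient)
Triplet = Tuple[str, str, str]


class Greylist:
    """Defer the first delivery attempt for each unseen triplet."""

    def __init__(self, delay_seconds: int = 300,
                 clock: Callable[[], float] = time.time) -> None:
        self.delay_seconds = delay_seconds  # assumed value for this sketch
        self.clock = clock                  # injectable so it can be tested
        self.first_seen: Dict[Triplet, float] = {}

    def check(self, client_ip: str, sender: str, recipient: str) -> str:
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Record the time of the first attempt; later attempts reuse it.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay_seconds:
            return "DUNNO"  # no opinion; let later restrictions decide
        return "DEFER_IF_PERMIT Greylisted, please retry later"
```

A real MTA queues the message and retries after the delay, so it gets through on a later attempt.  Most bots fire once and never retry, which is why the check costs so little compared to scanning the message body.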

The bulk of efficient spam blocking is performed based on the following:

1.  Client IP address reputation (think dnsbl, local block lists)
2.  Client FCrDNS (PTR name), lack thereof or generic (think dsl/cable)
3.  Improper HELO/EHLO string

SPECmail cannot simulate any of these things because they're all based
on IP address or DNS.  Let me say that again:  SPECmail cannot simulate
any of these things.  Yet, they are the most important aspects of
architecting an efficient large internet mail system because, again,
50-90% of an org's mail stream is spam.
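
As an illustration of item 2, here is a rough Python sketch of a forward-confirmed reverse DNS (FCrDNS) test, which is approximately what reject_unknown_client_hostname checks.  The function names are hypothetical, and the resolver functions are passed in as parameters purely so the logic can be exercised without live DNS; Postfix's own implementation differs in detail.

```python
import socket
from typing import Callable, List


def lookup_ptr(ip: str) -> List[str]:
    """Return the PTR name(s) for an address, or [] if there are none."""
    try:
        name, aliases, _ = socket.gethostbyaddr(ip)
    except (socket.herror, socket.gaierror):
        return []
    return [name] + list(aliases)


def lookup_addresses(hostname: str) -> List[str]:
    """Return the addresses a hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    return [info[4][0] for info in infos]


def passes_fcrdns(ip: str,
                  ptr: Callable[[str], List[str]] = lookup_ptr,
                  forward: Callable[[str], List[str]] = lookup_addresses) -> bool:
    """True if some PTR name of `ip` resolves forward to `ip` again."""
    return any(ip in forward(name) for name in ptr(ip))
```

Because the outcome depends entirely on the connecting address and on live DNS data, a benchmark that generates its load from a handful of lab machines has nothing realistic to feed into a check like this.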

The following simple header check will kill most spam from hijacked
accounts at Yahoo, Google, Hotmail, and private orgs running the likes
of Squirrelmail, etc:

header_checks = pcre:/etc/postfix/header_checks

/etc/postfix/header_checks

# Reject spam from compromised accounts/hosts

/^Received: from user /         REJECT Compromised account
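
Before deploying a pattern like this, it can help to try it against sample headers.  The following is a small Python approximation, not Postfix itself: the helper name and the sample header lines are made up for illustration, and Python's `re` module stands in for PCRE.  Postfix pcre tables match case-insensitively by default, which the sketch mirrors with `re.IGNORECASE`.

```python
import re

# Same expression as the header_checks rule; note the trailing space.
COMPROMISED = re.compile(r"^Received: from user ", re.IGNORECASE)


def header_action(header_line: str) -> str:
    """Return the action this one rule would yield for a header line."""
    if COMPROMISED.search(header_line):
        return "REJECT Compromised account"
    return "DUNNO"  # no match; the message continues as usual


if __name__ == "__main__":
    samples = [
        "Received: from user (unknown [192.0.2.5]) by mx.example.org",
        "Received: from mail.example.com (mail.example.com [192.0.2.9])",
    ]
    for line in samples:
        print(header_action(line), "<-", line)
```

The trailing space in the pattern matters: a host that merely begins with "user", such as username.example.com, is not matched.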


This is not a complete list of anti-spam techniques, but it is a very
good start, and processing these is extremely efficient compared to
content filters.  Most malware and phish arrive via spam, so killing
these early with the above techniques relieves a huge load from your
content filters, dramatically increasing overall system efficiency.

Given that the inbound email load for most organizations consists of
anywhere from 50-90% spam, you should implement all the techniques you
can, starting with those I mention above, that kill spam before piping
it to your resource intensive content filters.
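
A back-of-the-envelope Python calculation shows why this matters for the content filters.  The model is a deliberate simplification and an assumption of this sketch: every legitimate message reaches the filters, and the cheap checks reject only spam.  The function name is hypothetical.

```python
def filter_load_fraction(spam_share: float, prefilter_kill_rate: float) -> float:
    """Fraction of inbound mail that still reaches the content filters.

    spam_share          -- portion of inbound mail that is spam (0..1)
    prefilter_kill_rate -- portion of that spam rejected before the
                           content filters (0..1)
    """
    if not (0.0 <= spam_share <= 1.0 and 0.0 <= prefilter_kill_rate <= 1.0):
        raise ValueError("both arguments must be between 0 and 1")
    return 1.0 - spam_share * prefilter_kill_rate


if __name__ == "__main__":
    for spam_share, kill_rate in [(0.5, 0.5), (0.9, 0.8), (0.9, 0.9)]:
        load = filter_load_fraction(spam_share, kill_rate)
        print(f"spam {spam_share:.0%}, pre-filter kill {kill_rate:.0%}: "
              f"{load:.0%} of inbound mail reaches the content filters")
```

With 90% spam and 80% of it rejected up front, only a little over a quarter of the original volume is left for amavisd-new, SpamAssassin and clamav to scan.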

All environments vary to a degree.  That said, at my small domain, I
don't run any content filters at all, yet kill ~99.9% of inbound spam
via the means described above, and via local block lists--bot spam,
snowshoe, mainsleaze, etc.  I'm killing it all, again, without content
filters.  At 1 million users, you will still need content filters,
merely due to extrapolating percentages.  In my system, 1 of 1000 spams
gets through my filters.  If you used my filters alone, then for every
round of spam that sends one message to each of your 1 million users,
roughly 1,000 spams/malware/viruses would make it into user mailboxes,
which is unacceptable, even though "99.9%" sounds great.  You need more
like a 99.999% spam killing rate, though that might be wishful thinking.
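
The extrapolation is simple arithmetic, shown here as a short Python sketch.  It assumes one spam addressed to each of the 1 million users and expresses the block rate as "1 in N gets through" to stay in whole numbers; the function name is hypothetical.

```python
def leaked_spam(spam_attempts: int, leak_one_in: int) -> int:
    """Number of spams delivered if 1 in `leak_one_in` gets through."""
    return spam_attempts // leak_one_in


if __name__ == "__main__":
    attempts = 1_000_000  # one spam per user at a 1 million user site
    # 99.9% blocked means 1 in 1,000 leaks; 99.999% means 1 in 100,000.
    for label, one_in in [("99.9%", 1_000), ("99.999%", 100_000)]:
        print(f"{label} blocked: {leaked_spam(attempts, one_in)} delivered")
```

Under these assumptions the 99.9% rate lets 1,000 messages through per round, while 99.999% brings that down to 10.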

Hope you find some of this information valuable/useful.

--
Stan
