Hi,
I have just committed a new plugin called hash_greylist:
http://github.com/MichalH/qpsmtpd/raw/master/plugins/hash_greylist
It is mostly based on the greylisting plugin by Gavin Carr.

How does it work?
-Instead of tracking the sender IP or the IP/sender/recipient triplet, as
classic greylisting does, it tracks an MD5 hash of each message. The hash
is computed from the message body, sender address, recipient addresses and
Message-ID header.
-The plugin works in the data_post phase.
-When a remote IP and MD5 hash are seen for the first time, the message is soft denied.
-When the hash is seen for the second time, the remote IP is whitelisted.
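
The steps above could be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's actual code (the plugin itself is a qpsmtpd plugin). The names message_hash and HashGreylist, the in-memory sets, the NUL separators, and the lowercasing and sorting of addresses are all assumptions made for the example. The black_timeout handling is left out.

```python
import hashlib
from typing import Sequence


def message_hash(body: bytes, sender: str,
                 recipients: Sequence[str], message_id: str) -> str:
    """MD5 over body, sender, recipients and Message-ID (hypothetical layout)."""
    h = hashlib.md5()
    h.update(body)
    h.update(b"\0" + sender.lower().encode())
    # Sorting makes the hash independent of recipient order (an assumption).
    for rcpt in sorted(r.lower() for r in recipients):
        h.update(b"\0" + rcpt.encode())
    h.update(b"\0" + message_id.encode())
    return h.hexdigest()


class HashGreylist:
    """In-memory stand-in for the plugin's hash database (assumption)."""

    def __init__(self) -> None:
        self.seen_hashes = set()
        self.whitelisted_ips = set()

    def check(self, remote_ip: str, digest: str) -> str:
        if remote_ip in self.whitelisted_ips:
            return "ACCEPT"
        if digest in self.seen_hashes:
            # Second sighting of the same message: whitelist the retrying IP.
            self.whitelisted_ips.add(remote_ip)
            return "ACCEPT"
        # First sighting: remember the hash and ask the sender to retry.
        self.seen_hashes.add(digest)
        return "DENYSOFT"
```

Because the lookup is keyed on the message hash rather than on the IP, a retry that arrives from a different IP is still recognised.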

How is it better/worse than classic greylisting?
-it doesn't save bandwidth, because it rejects the message in the data_post phase, after the body has already been transferred
+it is immune to IP changes when the sending server retries (e.g. Gmail).
You don't have to maintain a whitelist of servers that are not compatible
with greylisting.
-it consumes a little more CPU (MD5 hashing)
+it rejects almost all botnet spam, messages from worms that don't follow the RFCs, etc.
+I don't have any stats, but I have found that most worms that are able to
pass greylisting expect the message to be rejected in the MAIL or RCPT
TO phase. If the message is rejected after the DATA phase, there is no retry.

I've been using this plugin on 4 production servers for more than a
month and I have observed two minor problems:
-black_timeout cannot be more than 60 seconds. Hotmail and Exchange
2007 servers treat the rejection as transient and retry every 60
seconds by default.
-the hash database grows without bound, so it needs to be cleaned. The
cleaning procedure is launched every hour by default.
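
A periodic cleanup of this kind might look like the sketch below. It is only an illustration in Python, not the plugin's code: the function name purge_old_hashes, the dict mapping each hash to the time it was first seen, and the max_age retention parameter are all assumptions.

```python
import time
from typing import Dict, Optional


def purge_old_hashes(first_seen: Dict[str, float], max_age: float,
                     now: Optional[float] = None) -> int:
    """Delete hashes first seen more than max_age seconds ago.

    Returns the number of entries removed. Both the storage layout and
    max_age are hypothetical.
    """
    if now is None:
        now = time.time()
    # Collect keys first so the dict is not modified while iterating over it.
    expired = [digest for digest, ts in first_seen.items() if now - ts > max_age]
    for digest in expired:
        del first_seen[digest]
    return len(expired)
```

Running something like this once an hour keeps the database size proportional to the traffic seen within max_age instead of growing forever.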

Here are the monthly statistics from a host that works as a backup MX and
gets about 500000 messages a month:

13      check_earlytalker
22429   require_resolvable_fromhost
41      rhsbl
132091  check_badmailfrom
559     count_unrecognized_commands
81641   check_basicheaders
2142    virus::clamdscan
215258  hash_greylist
83      spamassassin
23195   rcpt_ok
77      queued

The plugins are listed in the order in which they are configured.

Here are the monthly statistics from another host, which works as a primary
MX and gets about 50000 messages a month:
1163    check_earlytalker
169     require_resolvable_fromhost
7316    check_spamhelo
25979   require_resolvable_client
4061    check_basicheaders
693     sender_permitted_from
62      virus::clamdscan
4873    hash_greylist
252     spamassassin
98      rcpt_ok
2769    queued

These examples show that only 0.03% and 6% of messages, respectively, are
scanned by spamassassin, which is priceless in terms of memory consumption,
CPU consumption, false positives and false negatives.
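
One reading of the tables that reproduces these percentages: a message reaches spamassassin if it is either counted against spamassassin or ends up queued. That interpretation is an assumption, and the helper name scanned_share is hypothetical. A quick check in Python:

```python
def scanned_share(spamassassin_count: int, queued: int, total: int) -> float:
    """Percentage of all messages that got as far as spamassassin.

    Assumes scanned = counted against spamassassin + queued.
    """
    return 100.0 * (spamassassin_count + queued) / total


backup_mx = scanned_share(83, 77, 500_000)      # (83 + 77) / 500000
primary_mx = scanned_share(252, 2769, 50_000)   # (252 + 2769) / 50000

print(f"backup MX:  {backup_mx:.2f}%")   # backup MX:  0.03%
print(f"primary MX: {primary_mx:.2f}%")  # primary MX: 6.04%
```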

Comments are very welcome.

Thanks
Michal H.


PS. Gmail's spell-check suggestions for the word 'spamassassin' are:
'spam assassin', 'spermatozoon' and 'circumcising' :)
