Rick Knight wrote:
> John,
> 
> What are you using to filter on HELO-no-dots? I've looked at milter-regex,
> but I can't get it to build on my slackware 12 system.

That would be the __HELO_NO_DOMAIN rule, modified from vanilla 3.2.5
by updates.spamassassin.org to something less useful and then reverted
back by Justin Mason in subversion, see
http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/sandbox/jm/20_basic.cf?revision=825439&view=markup#l84

Scoring at http://ruleqa.spamassassin.org/week/__HELO_NO_DOMAIN/detail
>> MSECS    SPAM%     HAM%     S/O    RANK   SCORE  NAME
>>     0  19.9863   1.1186   0.947    0.61   (n/a)  __HELO_NO_DOMAIN

Included in khop-general (be wary of wrapping):

# from SVN at rulesrc/sandbox/jm/20_basic.cf
header __HELO_NO_DOMAIN X-Spam-Relays-External =~ /^[^\]]+ helo=[^\.]+ /

meta  KHOP_NO_FQDN   __HELO_NO_DOMAIN && (RDNS_NONE || RDNS_DYNAMIC)
describe KHOP_NO_FQDN  HELO: not a domain, no static reverse DNS on IP
score KHOP_NO_FQDN     0.5     # 20090603
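
To see what the header pattern and the meta rule above accept, here is a
minimal Python sketch. The relay strings are hypothetical and simplified
(real X-Spam-Relays-External values carry more fields), and khop_no_fqdn()
is an illustrative helper of my own, not part of SpamAssassin:

```python
import re

# Same pattern as the __HELO_NO_DOMAIN header rule: within the first relay
# (before any "]"), a helo= value that contains no dot before the next space.
HELO_NO_DOMAIN = re.compile(r"^[^\]]+ helo=[^\.]+ ")


def khop_no_fqdn(relays_external, rdns_none, rdns_dynamic):
    """Mirror of: __HELO_NO_DOMAIN && (RDNS_NONE || RDNS_DYNAMIC)."""
    helo_no_domain = HELO_NO_DOMAIN.search(relays_external) is not None
    return helo_no_domain and (rdns_none or rdns_dynamic)


# Hypothetical relay strings, simplified for illustration only.
dotless = "[ ip=192.0.2.10 rdns= helo=localhost by=mx.example.org ident= ]"
fqdn = ("[ ip=192.0.2.20 rdns=mail.example.com helo=mail.example.com "
        "by=mx.example.org ident= ]")

print(khop_no_fqdn(dotless, rdns_none=True, rdns_dynamic=False))
print(khop_no_fqdn(dotless, rdns_none=False, rdns_dynamic=False))
print(khop_no_fqdn(fqdn, rdns_none=True, rdns_dynamic=False))
```

Only the first call should fire: a dotless HELO alone is not enough, and a
HELO containing a dot never matches, whatever the reverse DNS looks like.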

I used  (RDNS_NONE || RDNS_DYNAMIC)  in an attempt to limit the damage
to ham ... my recollection is that the ruleqa stats were less
favorable when I wrote the rule back in June.  I saved a copy of
__HELO_NO_DOMAIN spam/ham hits over time (those disappear
occasionally) at http://yfrog.com/athelonodomainhist2009101g -- it
does appear to have had more FPs.

KHOP_NO_FQDN needs to be revisited, as it hits almost nothing even
though it combines only high-traffic rules:

rule               my spam%   corpus%   %of RDNS_NONE   %of RDNS_DYN
__HELO_NO_DOMAIN   unknown     20.0%        86%             <21%
RDNS_NONE           18.8%      57.6%       100%               0%
RDNS_DYNAMIC         9.9%      25.6%         0%             100%
KHOP_NO_FQDN         0.1%     unknown      (2.2%)            (0%)

If you're wondering why my own numbers are so low ... I use greylisting,
which is specifically good at picking out what these rules catch.
Assuming 86% overlap with RDNS_NONE (and no overlap with RDNS_DYNAMIC),
KHOP_NO_FQDN would catch about 50% of the spam corpus (0.86 x 57.6%),
which is serious stuff, but using my own overlap number of 2.2%, that's
1.27% (0.022 x 57.6%), which might not be so bad.  (Values in
parentheses are my own data, since no masscheck data is available.)
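
The estimate above is just the overlap fraction multiplied by the
RDNS_NONE corpus rate. A minimal Python sketch of that arithmetic, with
estimated_hit_rate() as a hypothetical helper name and the inputs taken
from the table:

```python
def estimated_hit_rate(overlap_fraction, rdns_none_rate):
    """Estimated KHOP_NO_FQDN spam hit rate, in percent.

    Assumes no overlap with RDNS_DYNAMIC, so only the RDNS_NONE branch
    of the meta rule contributes.
    """
    return overlap_fraction * rdns_none_rate


RDNS_NONE_CORPUS = 57.6  # percent of corpus spam hitting RDNS_NONE

# 86% overlap, from the corpus column of the table.
corpus_estimate = estimated_hit_rate(0.86, RDNS_NONE_CORPUS)

# 2.2% overlap, from my own (parenthesized) numbers.
own_estimate = estimated_hit_rate(0.022, RDNS_NONE_CORPUS)

print(f"{corpus_estimate:.1f}% vs {own_estimate:.2f}%")
```

The two results differ by a factor of about 39, which is why the choice
of overlap figure matters so much for scoring.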
