On Thu, 26 Jul 2007, martin f krafft wrote:
> Hi list,
> I just had a flood of spam coming through, which SA classified as
> ham. On closer inspection, it turns out that the only tests
> triggered for all those mails were HTML_MESSAGE and BAYES_99.
> HTML messages are commonplace today (unfortunately), so they don't
> add anything to the score.
> BAYES_99 yields 3.5 points.

Simply make BAYES_99 worth 4.0 points and set the threshold to 4.0 as
well. Of course it's a very aggressive setting, but it just works for me.
I have never noticed a false positive that hit BAYES_99, and ham mails
usually get some negative score from other rules. There are very few
false positives, but they match more rules (usually some of the RBL ones)
and they usually score over 5.0. So it doesn't pay to treat mail with a
4.0 score as ham: you only let more spam through, and most false
positives would still be over the threshold anyway. Sorting mail that
scores 4.0-6.0 into a separate folder, checked briefly once a day, does
the job. SA scores about 1200 mails a day for me (~1150 of them spam),
with hardly any false positives (~5 per week).
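
The folder sorting described above could be sketched roughly like this.
It is only an illustration, not my actual setup: it assumes the score is
read from the X-Spam-Status header that SA adds, and the folder names,
the function names and the handling of the exact 4.0 and 6.0 boundaries
are made up for the example.

```python
import re

# Assumed folder names; they are hypothetical and only serve the example.
INBOX, REVIEW, SPAM = "INBOX", "spam-review", "spam"

# Matches "score=4.7", or "hits=4.7" as written by older SA versions.
_SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def parse_score(x_spam_status):
    """Return the score from an X-Spam-Status value, or None if absent."""
    match = _SCORE_RE.search(x_spam_status)
    return float(match.group(1)) if match else None


def route(x_spam_status, threshold=4.0, review_upper=6.0):
    """Pick a folder: ham, borderline spam to review, or clear spam."""
    score = parse_score(x_spam_status)
    if score is None or score < threshold:
        return INBOX
    if score <= review_upper:
        # Borderline range (assumed inclusive): worth a quick daily look.
        return REVIEW
    return SPAM
```

A mail that hits only BAYES_99 at 4.0 points lands in the review folder,
so it stays out of the inbox but can still be rescued if it turns out to
be ham.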
Of course, things get complicated if you have users who are not that
familiar with mail filtering and just want to download their mail via
POP3. A 4.0 threshold would be too low for them, but unless you find some
fingerprints in those messages and write appropriate rules, there is
nothing you can do.
I've been collecting SA results in an SQL database for some time now, so
some day I'll try to compile statistics on the messages in the most
confusing 4-6 score range. In my opinion this is the key to fine-tuning
SpamAssassin.
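
As a rough idea of what such statistics might look like, here is a small
sketch. The table layout (a "results" table with score, is_spam and a
comma-separated tests column) is purely an assumption for illustration,
as are the function names, the hand-assigned is_spam label and the demo
rows; the real database may look quite different.

```python
import sqlite3
from collections import Counter


def borderline_rule_counts(conn, low=4.0, high=6.0):
    """Count rule hits for spam and ham with low <= score <= high."""
    counts = {"spam": Counter(), "ham": Counter()}
    rows = conn.execute(
        "SELECT is_spam, tests FROM results WHERE score BETWEEN ? AND ?",
        (low, high),
    )
    for is_spam, tests in rows:
        kind = "spam" if is_spam else "ham"
        counts[kind].update(t for t in tests.split(",") if t)
    return counts


def demo():
    """Build a tiny in-memory database with made-up rows and tally it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (score REAL, is_spam INTEGER, tests TEXT)"
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [
            (4.0, 1, "BAYES_99,HTML_MESSAGE"),
            (5.1, 0, "BAYES_99,RCVD_IN_EXAMPLE_RBL"),  # invented rule name
            (4.0, 1, "BAYES_99,HTML_MESSAGE"),
            (9.3, 1, "BAYES_99,HTML_MESSAGE"),  # outside the range
        ],
    )
    return borderline_rule_counts(conn)
```

Rules that show up often on the ham side of this range but rarely on the
spam side (or the other way round) are the candidates for score changes
or for new rules.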
--
Michał Jęczalik, +48.603.64.62.97
INFONAUTIC, +48.33.487.69.04