I've used spamassassin for many years, on Ubuntu via amavisd, with great success. In recent months, however, I've been receiving several spam messages each day that evade the filters.

* These false negatives follow a handful of simple, formulaic textual patterns on common subjects.
* The emails consist of fairly plain HTML and appear not to employ any significant obfuscation.
* I have tried to train spamassassin with many of these spam samples, without any effect.
* The bayes database is being updated. The bayes_journal (37 KB), bayes_seen (5.2 MB) and bayes_toks (5.4 MB) files all have recent timestamps.
* The false negatives all match BAYES_00, attracting a default score of -1.9. BAYES_00 seems to be at the crux of the misclassification.

Is there a way to delve into why these messages have been allocated such a low bayes score when, to a human, they are blatant, simple spam on "vanilla" spam topics? Has my bayes data been "poisoned" somehow? It is worth noting that I get a lot of correctly identified spam, much of which matches BAYES_99 and BAYES_999, and my ham gets BAYES_00, so for many messages bayes is working. Is it likely that I am suffering poor performance for these specific messages as a result of some tunable parameter?
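
One way to put numbers on the split described above is to tally which BAYES_* rule each message hit, separately for the missed spam, the caught spam and the ham. The following is only a minimal Python sketch, not part of any spamassassin tooling. It assumes each message still carries an X-Spam-Status header whose tests list names the rules that fired; the function names bayes_rules and tally are made up for illustration.

```python
import re
from collections import Counter
from email import message_from_string

# Matches rule names such as BAYES_00, BAYES_99 or BAYES_999.
BAYES_RULE = re.compile(r"\bBAYES_\d+\b")


def bayes_rules(raw_message):
    """Return the sorted BAYES_* rule names found in X-Spam-Status.

    Assumption: the header looks roughly like
    'No, score=-1.2 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001]'.
    Folded (multi-line) headers are handled because the regex simply
    scans the whole header text.
    """
    msg = message_from_string(raw_message)
    status = " ".join(msg.get_all("X-Spam-Status", []))
    return sorted(set(BAYES_RULE.findall(status)))


def tally(raw_messages):
    """Count how many messages hit each BAYES_* rule."""
    counts = Counter()
    for raw in raw_messages:
        rules = bayes_rules(raw)
        # Keep messages with no bayes hit visible in the totals.
        counts.update(rules or ["(no BAYES rule)"])
    return counts
```

Feeding the raw text of each missed spam message to tally() would confirm whether every one of them really lands in BAYES_00, or whether some carry no BAYES_* rule at all, which would suggest bayes was never consulted for those messages.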

What is the most effective way to tackle this?
