On 21 Jul 2015, at 20:55, Roman Gelfand wrote:

It seems that if DKIM or SPF is verified, the bayesian learning doesn't
matter.

Not so. Perhaps you need to refresh your understanding of what SpamAssassin is. It is not a collection of binary switches, but rather a scoring system consisting of rules which have various scores.

How much each rule matters is a local decision, subject to default values.

X-Spam-Status: No, score=3.6 required=5.0 tests=BAYES_99,BAYES_999,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,SPF_PASS autolearn=no version=3.3.2

3.3.2 is rather obsolete, but I still have the default rules lying about...

/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score BAYES_99 0 0 3.8 3.5
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score BAYES_999 0 0 0.2 0.2
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score DKIM_SIGNED 0.1
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score DKIM_VALID -0.1
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score DKIM_VALID_AU -0.1
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score HTML_MESSAGE 0.001
/var/spamassassin/3.003002/updates_spamassassin_org/50_scores.cf:score SPF_PASS -0.001

The arithmetic, assuming you allow network tests: the two Bayes rules (de facto Bayes certitude of spamminess) add up to only 3.7. That is 3.5 + 0.2, taking the last of the four score columns, which is the one used when both Bayes and network tests are enabled. All of the DKIM and SPF crap nets out to -0.101, which vastly overstates its value in making spam/ham decisions; as independent rules, that value is in fact indistinguishable from zero. Still, it remains a small mitigation relative to the Bayes rules, which are much more reliable but still subject to error by their nature as statistically derived values. With HTML_MESSAGE at 0.001, the total is 3.7 - 0.101 + 0.001 = 3.6, which is consistent with your shown score and a reasonable understanding of spam.

On the other hand, if you really trust your Bayes DB and have a particular widespread flavor of spam hitting you, that precise set of rules (including HTML_MESSAGE) makes an excellent 'meta' rule worth a solid half point. If you don't have a lot of non-spam marketing mail that you get voluntarily, you can probably also lower your threshold to 4.5 or maybe even 4. Try this first on a personal mail server, NOT on one handling mail for a broad audience including people who can fire you, at least until after you've analyzed the mail stream very carefully.
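
As a rough sketch only, that combination could go in a local config file (local.cf or wherever your site keeps its local rules) something like the lines below. The rule name LOCAL_BAYES_AUTH_HTML and its description are made up for illustration; the 0.5 score and the 4.5 threshold are just the numbers suggested above, not tested values, and the right ones depend on your own mail stream.

```
# Hypothetical local meta rule: fires only when the exact set of rules
# from the quoted X-Spam-Status header all hit on the same message.
meta     LOCAL_BAYES_AUTH_HTML  (BAYES_99 && BAYES_999 && DKIM_SIGNED && DKIM_VALID && DKIM_VALID_AU && SPF_PASS && HTML_MESSAGE)
describe LOCAL_BAYES_AUTH_HTML  Bayes-certain spam in authenticated HTML mail
score    LOCAL_BAYES_AUTH_HTML  0.5

# Optional, and only if you get little legitimate marketing mail:
# lower the spam threshold from the default 5.0.
required_score 4.5
```

With those two changes the example message would score 3.6 + 0.5 = 4.1, so the meta rule and the 4.5 threshold together still would not tag it; it would only cross the line at a threshold of 4. Running "spamassassin --lint" after editing should catch syntax errors before the change goes live.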





