Randy Ramsdell wrote:
Hi,
One thing I do not understand regarding AWL and BAYES. When a message
is reported to me as spam and was not marked as spam, I test it using
debug before and after sa-learn. Each time I do this, BAYES_99 does
hit, but the results also include AWL.
1. Does anyone understand why this happens?
I assume you're asking about why the AWL appears. That's normal. The
first thing to realize is the AWL is *NOT* a whitelist. It's a
sender-based score averager. It has both white and blacklist effects.
If the current message scores higher than the past average for a
sender, the AWL will take points off, trying to "split the difference"
between the past and current scores.
Since you just sa-learned a message from a sender who has probably never
sent you mail before, the score now is almost guaranteed to be higher than
on the first pass through, resulting in a negative AWL score.
However, that's not a problem. Note this message, even with the AWL,
didn't fall below the spam tag threshold. The AWL doesn't work on a
"good vs bad" senders basis, so just because it scores negative, it
doesn't mean the AWL thinks the message is nonspam. In your example, it
just thought it was less spammy, but still spam.
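To make the "split the difference" arithmetic concrete, here is a rough sketch using the scores from the two reports quoted further down. It assumes the AWL adjustment is (past mean - current score) multiplied by the auto_whitelist_factor setting at its default of 0.5, that the sender's only history is the 3.982 from the first pass, and that BAYES_99 scores exactly the 3.5 shown in the rounded report. The function name is made up for illustration and this is not SpamAssassin's actual code.

```python
def awl_adjustment(past_mean, current_score, factor=0.5):
    """Points the AWL adds: a fraction of the gap between the sender's
    past mean score and the current (pre-AWL) score.
    factor=0.5 is assumed to match the default auto_whitelist_factor."""
    return (past_mean - current_score) * factor


# Rule scores after sa-learn, before the AWL is applied:
# SUB_HELLO + UNDISC_RECIPS + BAYES_99 + ADVANCE_FEE_1
pre_awl = 2.141 + 0.841 + 3.5 + 0.0

# Assumption: the sender's history holds only the first-pass score.
past_mean = 3.982

awl = awl_adjustment(past_mean, pre_awl)
final = pre_awl + awl
required = 5.0  # spam threshold from the report

print(f"AWL {awl:+.2f}, final {final:.2f}, still spam: {final >= required}")
```

Under those assumptions the adjustment comes out at about -1.25 and the final score at about 5.23, which lines up with the -1.2 AWL and 5.2 total in the report: the AWL pulled the score halfway back toward the earlier 3.982, but not far enough to drop it below 5.0.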
You might want to read this wiki article for a better discussion of the
AWL's behaviors:
http://wiki.apache.org/spamassassin/AwlWrongWay
2. I also noticed that when using "spamassassin -D" on a message, I
sometimes see a nice report like below (2nd example) but other times
it doesn't show the formatted report. Any ideas on this one?
SA won't generate a formatted report for a message below the spam tag
level. You can force it to do so by adding -t.
Here is an example of two spam report headers for the same message.
Before sa-learn:
X-Spam-Status: No, score=3.982 tagged_above=-9999 required=5
tests=[ADVANCE_FEE_1=0, BAYES_60=1, SUB_HELLO=2.141, UNDISC_RECIPS=0.841]
X-Spam-Score: 3.982
X-Spam-Level: ***
After sa-learn:
Content analysis details: (5.2 points, 5.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 2.1 SUB_HELLO              Subject starts with "Hello"
 0.8 UNDISC_RECIPS          Valid-looking To "undisclosed-recipients"
 3.5 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 1.0000]
 0.0 ADVANCE_FEE_1          Appears to be advance fee fraud (Nigerian 419)
-1.2 AWL                    AWL: From: address is in the auto white-list
Thanks,
Randy Ramsdell