Re: Seeking advice re: SA score discrepancies

Jerry Pape Sun, 17 Oct 2010 22:44:13 -0700

Oops, further investigation indicates that Bayes is "on"--thought the default was "off" for my config. I would be inclined to turn it off as I have no decent way of teaching it beyond mass-config into the future--please advise.

JP


On 10/17/10 10:37 PM, Jerry Pape wrote:

Wow, I am grateful for the prompt answers, but I must say they have confused me.
Bayes should not be on in my config, and a subsequent check of the GUI says it's not--this may be wrong.
Further, what are the "scoreset" indexes?
I don't use Bayes because all of my clients are POP mail and they are neither smart nor committed enough to mail back ham/spam to educate the system.
Additionally, when I used Bayes way back when (without manual population) and simply allowed auto-population to occur, I ended up with enormous .spamassassin sub-files that rapidly eclipsed 50% of the client's disk quota.
I am certain that I am missing critical configurational understanding and optimizations, but until your lot kindly educates me--it is what it is and my initial dilemma remains unresolved.
JP

On 10/17/10 7:01 PM, John Hardin wrote:
On Sun, 17 Oct 2010, Jerry Pape wrote:
[Not sure if this is the right place to send this--please correct me if I am in error]
This is the place.
Assessment of this header at http://www.futurequest.net/docs/SA/decode/ yields:
Test                   Score  Description
BAYES_40               0.000  Bayesian spam probability is 20 to 40%
HTML_IMAGE_RATIO_02    0.550  HTML has a low ratio of text to image area
HTML_MESSAGE           0.001  HTML included in message
HTML_MIME_NO_HTML_TAG  1.052  HTML-only message, but there is no HTML tag
MIME_HTML_ONLY         1.672  Message only has text/html MIME parts
RDNS_NONE              0.100  Delivered to trusted network by a host with no rDNS
URIBL_BLACK            1.961  Contains an URL listed in the URIBL blacklist
Total:                 5.336

Clearly 5.336 does not equal 3.8.
There are four score sets to choose from based on what options you have enabled. The above is for scoreset 1, no BAYES + net tests. Scoreset 3, BAYES + net tests, gives:
  HTML_MIME_NO_HTML_TAG  0.097
  MIME_HTML_ONLY_MULTI   0.001
  HTML_IMAGE_RATIO_02    0.383
  HTML_MESSAGE           0.001
  MIME_HTML_ONLY         1.457
  BAYES_40              -0.185
  URIBL_BLACK            1.955
  RDNS_NONE              0.1
                        -------
                         3.809

These are all of the default scores, and match what you're seeing.
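
To make the score set selection concrete, here is a minimal Python sketch (not SpamAssassin code). It assumes the zero-based numbering used in the SpamAssassin configuration documentation, where Bayes off with network tests on is set 1 and Bayes plus network tests is set 3. The helper names scoreset_index and total are made up for illustration; the scores are copied from the two listings in this thread.

```python
from decimal import Decimal


def scoreset_index(use_bayes: bool, use_net: bool) -> int:
    """Zero-based score set: Bayes contributes 2, network tests contribute 1."""
    return (2 if use_bayes else 0) + (1 if use_net else 0)


# Scores as reported by the online decoder (Bayes off, network tests on).
NO_BAYES_NET = {
    "BAYES_40": "0.000",
    "HTML_IMAGE_RATIO_02": "0.550",
    "HTML_MESSAGE": "0.001",
    "HTML_MIME_NO_HTML_TAG": "1.052",
    "MIME_HTML_ONLY": "1.672",
    "RDNS_NONE": "0.100",
    "URIBL_BLACK": "1.961",
}

# Scores quoted in the reply (Bayes on, network tests on).
BAYES_NET = {
    "HTML_MIME_NO_HTML_TAG": "0.097",
    "MIME_HTML_ONLY_MULTI": "0.001",
    "HTML_IMAGE_RATIO_02": "0.383",
    "HTML_MESSAGE": "0.001",
    "MIME_HTML_ONLY": "1.457",
    "BAYES_40": "-0.185",
    "URIBL_BLACK": "1.955",
    "RDNS_NONE": "0.1",
}


def total(scores: dict) -> Decimal:
    """Sum rule scores exactly, avoiding binary floating-point rounding."""
    return sum((Decimal(v) for v in scores.values()), Decimal("0"))


print(scoreset_index(False, True), total(NO_BAYES_NET))  # 1 5.336
print(scoreset_index(True, True), total(BAYES_NET))      # 3 3.809
```

The same rules hit in both cases; only the column of scores changes, which is enough to explain 5.336 versus 3.8.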
I have no idea how to regress and resolve this problem.
First off, you need to review your Bayes training. An obviously spammy message shouldn't be hitting BAYES_40. Properly-trained Bayes, hitting BAYES_99, would have scored 7.494 on that message.
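
The 7.494 figure can be checked from the numbers already in this thread. The sketch below backs the BAYES_40 score out of the 3.809 total and derives the BAYES_99 score that the 7.494 claim implies; the variable names are illustrative, and the implied score is an inference from those figures rather than a value read from a rules file.

```python
from decimal import Decimal

observed_total = Decimal("3.809")    # total with BAYES_40, from the listing above
bayes_40 = Decimal("-0.185")         # BAYES_40 score in that listing
total_with_bayes_99 = Decimal("7.494")  # figure quoted for a BAYES_99 hit

# Remove the BAYES_40 contribution to get the total of the non-Bayes rules.
non_bayes_total = observed_total - bayes_40

# Whatever is left of 7.494 must be the BAYES_99 score.
implied_bayes_99 = total_with_bayes_99 - non_bayes_total

print(non_bayes_total, implied_bayes_99)  # 3.994 3.500
```

So the Bayes rule alone would move this message by roughly 3.7 points, from slightly helping it to heavily penalizing it.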
For analysis in general...

This will put the individual rule scores into the headers:
add_header all Status "_YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTSSCORES_ autolearn=_AUTOLEARN_ version=_VERSION_"
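
With that header in place, the per-rule scores can be pulled back out of a message for comparison. This is a rough Python sketch, assuming the tests= field is rendered as comma-separated RULE=score pairs and may be folded across lines; the function name rule_scores and the sample header value are hypothetical, built from the scores quoted earlier in this thread.

```python
import re
from decimal import Decimal


def rule_scores(status: str) -> dict[str, Decimal]:
    """Extract RULE=score pairs from the tests= field of a Status header value."""
    # Header folding usually inserts a newline and whitespace after a comma.
    unfolded = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S+)", unfolded)
    if not match:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        name, sep, value = item.partition("=")
        if sep:  # skips entries without a score, such as "none"
            scores[name] = Decimal(value)
    return scores


# Hypothetical header value, not taken from a real message.
sample = (
    "No, score=3.8 required=5.0 tests=BAYES_40=-0.185,"
    "HTML_IMAGE_RATIO_02=0.383,\n\tHTML_MESSAGE=0.001,"
    "HTML_MIME_NO_HTML_TAG=0.097,MIME_HTML_ONLY=1.457,\n\t"
    "MIME_HTML_ONLY_MULTI=0.001,RDNS_NONE=0.1,URIBL_BLACK=1.955 "
    "autolearn=no"
)

scores = rule_scores(sample)
print(len(scores), sum(scores.values(), Decimal("0")))  # 8 3.809
```

Comparing these per-rule values against an online decoder's output shows at a glance which score set each side used.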
"spamassassin --debug area=rules <test_msg_file" is often helpful.

However:
The nature of spam changes over time. 3.2, which is only getting critical bug fixes now, will become steadily less effective the more time passes and the spammers evolve new tricks. It's getting to the point that you should really consider upgrading to the latest 3.3 release.
