Oops, further investigation indicates that Bayes is "on"; I thought the
default was "off" for my config. I am inclined to turn it off, as I
have no decent way of teaching it beyond mass-config going forward.
Please advise.
JP
On 10/17/10 10:37 PM, Jerry Pape wrote:
Wow, I am grateful for the prompt answers, but I must say they have
confused me.
Bayes should not be on in my config, and a subsequent check of the GUI
says it's not, though this may be wrong.
Further, what are the "scoreset" indexes?
I don't use Bayes because all of my clients are on POP mail, and they
are neither smart nor committed enough to mail back ham/spam to educate
the system.
Additionally, when I used Bayes way back when (without manual
population) and simply allowed auto-population to occur, I ended up
with enormous .spamassassin sub-files that rapidly eclipsed 50% of the
client's disk quota.
I am certain that I am missing critical configurational understanding
and optimizations, but until your lot kindly educates me, it is what it
is, and my initial dilemma remains unresolved.
JP
On 10/17/10 7:01 PM, John Hardin wrote:
On Sun, 17 Oct 2010, Jerry Pape wrote:
[Not sure if this is the right place to send this--please correct me
if I am in error]
This is the place.
Assessment of this header at
http://www.futurequest.net/docs/SA/decode/ yields:
Test                    Score   Description
BAYES_40                0.000   Bayesian spam probability is 20 to 40%
HTML_IMAGE_RATIO_02     0.550   HTML has a low ratio of text to image area
HTML_MESSAGE            0.001   HTML included in message
HTML_MIME_NO_HTML_TAG   1.052   HTML-only message, but there is no HTML tag
MIME_HTML_ONLY          1.672   Message only has text/html MIME parts
RDNS_NONE               0.100   Delivered to trusted network by a host with no rDNS
URIBL_BLACK             1.961   Contains an URL listed in the URIBL blacklist
Total:                  5.336
Clearly 5.336 does not equal 3.8.
There are four score sets (numbered 0 to 3) to choose from, based on
which options you have enabled. The above is for scoreset 1, no BAYES +
net tests.
Scoreset 3, BAYES + net tests, gives:
HTML_MIME_NO_HTML_TAG    0.097
MIME_HTML_ONLY_MULTI     0.001
HTML_IMAGE_RATIO_02      0.383
HTML_MESSAGE             0.001
MIME_HTML_ONLY           1.457
BAYES_40                -0.185
URIBL_BLACK              1.955
RDNS_NONE                0.100
                        ------
                         3.809
These are all of the default scores, and match what you're seeing.
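For anyone wanting to check the arithmetic, here is a minimal Python sketch that sums the two lists quoted in this thread. The index function follows the numbering in the SpamAssassin configuration documentation (0 = neither, 1 = net only, 2 = Bayes only, 3 = Bayes + net). The dictionary and function names are made up for illustration, and only the two score sets quoted here are included.

```python
from decimal import Decimal

def scoreset_index(use_bayes: bool, use_net: bool) -> int:
    """Score set number as documented: Bayes adds 2, net tests add 1."""
    return (2 if use_bayes else 0) + (1 if use_net else 0)

# Scores exactly as quoted in this thread (strings keep the decimals exact).
NO_BAYES_NET = {
    "BAYES_40": "0.000",
    "HTML_IMAGE_RATIO_02": "0.550",
    "HTML_MESSAGE": "0.001",
    "HTML_MIME_NO_HTML_TAG": "1.052",
    "MIME_HTML_ONLY": "1.672",
    "RDNS_NONE": "0.100",
    "URIBL_BLACK": "1.961",
}
BAYES_NET = {
    "HTML_MIME_NO_HTML_TAG": "0.097",
    "MIME_HTML_ONLY_MULTI": "0.001",
    "HTML_IMAGE_RATIO_02": "0.383",
    "HTML_MESSAGE": "0.001",
    "MIME_HTML_ONLY": "1.457",
    "BAYES_40": "-0.185",
    "URIBL_BLACK": "1.955",
    "RDNS_NONE": "0.1",
}

def total(scores: dict) -> Decimal:
    """Add up the per-rule scores of one message."""
    return sum((Decimal(value) for value in scores.values()), Decimal("0"))

print(scoreset_index(use_bayes=False, use_net=True), total(NO_BAYES_NET))
print(scoreset_index(use_bayes=True, use_net=True), total(BAYES_NET))
```

The first total is the web decoder's figure and the second is the one in the message header, so the difference comes down to which score set is active.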
I have no idea how to regress and resolve this problem.
First off, you need to review your Bayes training. An obviously
spammy message shouldn't be hitting BAYES_40. Properly-trained Bayes,
hitting BAYES_99, would have scored 7.494 on that message.
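The 7.494 figure can be sanity-checked by removing the BAYES_40 contribution from the 3.809 total and seeing what is left for BAYES_99. This is just a sketch of that arithmetic using the numbers quoted in this thread; the helper name is invented for illustration.

```python
from decimal import Decimal

total_with_bayes_40 = Decimal("3.809")   # total quoted for Bayes + net tests
bayes_40 = Decimal("-0.185")             # BAYES_40 score in that set
total_with_bayes_99 = Decimal("7.494")   # total quoted for a BAYES_99 hit

def implied_rule_score(new_total, old_total, old_rule_score):
    """Score the replacement rule must have, given both totals."""
    # Strip the old rule from the old total, then see what the new total adds.
    return new_total - (old_total - old_rule_score)

bayes_99 = implied_rule_score(total_with_bayes_99, total_with_bayes_40, bayes_40)
print(bayes_99)  # 3.500
```

In other words, the quoted totals imply a BAYES_99 score of 3.5 in this score set, versus -0.185 for BAYES_40.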
For analysis in general...
This will put the individual rule scores into the headers:
add_header all Status "_YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTSSCORES_ autolearn=_AUTOLEARN_ version=_VERSION_"
"spamassassin --debug area=rules <test_msg_file" is often helpful.
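Once the per-rule scores are in the header, they are easy to pull out with a script. The following is a rough Python sketch, assuming the tests field comes out as comma-separated NAME=score pairs followed by autolearn=, as in the add_header line above. The sample header value is invented for illustration (including the version placeholder), and the function name is hypothetical.

```python
import re

def parse_tests_scores(status: str) -> dict[str, float]:
    """Extract {rule_name: score} from an X-Spam-Status style value."""
    # Take everything between "tests=" and " autolearn=" (or end of string).
    match = re.search(r"tests=(.*?)(?:\s+autolearn=|$)", status, flags=re.S)
    if not match:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        item = item.strip()  # folded header lines leave stray whitespace
        if "=" not in item:
            continue
        name, _, value = item.partition("=")
        scores[name.strip()] = float(value)
    return scores

# Invented sample, folded across two lines like a real mail header.
sample = (
    "No, score=3.8 required=5.0 tests=BAYES_40=-0.185,HTML_MESSAGE=0.001,\n"
    "\tURIBL_BLACK=1.955 autolearn=no version=3.2.x"
)
for rule, score in sorted(parse_tests_scores(sample).items()):
    print(f"{rule:24s}{score:7.3f}")
```

This makes it straightforward to compare what the server actually applied against what an external decoder reports.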
However:
The nature of spam changes over time. 3.2, which is only getting
critical bug fixes now, will become steadily less effective the more
time passes and the spammers evolve new tricks. It's getting to the
point that you should really consider upgrading to the latest 3.3
release.