Yes indeedy. And I've been looking at Bayes scores here just a wee bit.
BAYES_99 just does not hit on ham and hits on high percentages of spam.
Even BAYES_95 does not hit ham. I go down to BAYES_80 before I hit 0.05
percent of ham. I am toying with the idea of recognizing this feature and
tweaking my already slightly modified BAYES rules a little further. I
raised 99 to 5 points already. I am thinking of moving 95 up to 4.5 and
80 up a point or so. If I get no more false tags than I have now (chiefly
private email spam discussions and some LKML postings), then I will
conclude that the "theoretical" treatment used for setting Bayes scores
in SA needs some thoughtful reevaluation. I understand the concept of the
math involved in the scoring. But I suspect the assumptions made are a
little off kilter. Of course, I have a very carefully nurtured BAYES.
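
For anyone who wants to try the same experiment, here is a minimal sketch of
how the overrides might look in local.cf. It assumes the standard "score"
directive and uses only the numbers mentioned above. The BAYES_80 line is left
commented out because no exact value has been settled on.

```
# local.cf (sketch only; tune against your own ham/spam corpus)
# A single value applies the score to all score sets.
score BAYES_99 5.0    # already raised to 5 points
score BAYES_95 4.5    # proposed, not yet tested
# score BAYES_80 ...  # "up a point or so" from its current value; number not decided
```

Whether these values are safe depends heavily on how well the Bayes database
has been trained, so watch the false-positive rate after any change.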

And of course, SARE rules are still needed. (Bob, I have a gem for you.
It was Base64 encoded with four characters per line. I suspect it is a
virus. It's .vbe labeled. No WAY I am going to run something from Cuba
with a .vbe suffix on a Windows machine. I am not that stooooopid.)

{^_^}
----- Original Message ----- From: "Pierre Thomson" <[EMAIL PROTECTED]>


I am continually amazed at the ability of the Bayesian engine to recognize garbage. Those who think they can "poison" a Bayes DB with meaningless text are deluded.

Here's a snip of spamassassin -t on one of today's spams, with nothing but a URL, an inline gif and random words. (SA 2.64)

Content preview:  URI:http://vn18in04j7i0dddnygdqivvd.nefsegmhb.com/
 URI:cid:794dfa4f13@mindspring.com Week organ material sing, dog first.
 Cut sun pay, story should go, love. Put fight team. Free practice voice
 body, will. His or room color left hope. Condition thousand minute most
 more. Night, end, center very soon need street. Though, test can
 enough, to earth strange. Large own some race book. Far, land five.
 Since, made from. Strange house forest family. Back lay knew me country
 tree. [...]

Content analysis details:   (9.0 points, 6.0 required)

pts rule name              description
---- ---------------------- --------------------------------------------------
4.0 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                           [score: 0.9994]
1.0 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
0.1 HTML_MESSAGE           BODY: HTML included in message
1.0 HTML_IMAGE_ONLY_04     BODY: HTML: images with 200-400 bytes of words
1.7 PT_LMS                 URI: long-medium-short URI
1.2 PRIORITY_NO_NAME       Message has priority setting, but no X-Mailer


Pierre Thomson
BIC
