For the record, I also began noticing many more 100% Bayesian matches after
upgrading from 2.55 to 2.6.  So far it hasn't resulted in any false
positives that I'm aware of, but it has left me feeling slightly uneasy.

Brian

-----Original Message-----
From: Ben Wing [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 17, 2003 4:18 PM
To: [EMAIL PROTECTED]
Subject: [SAtalk] strange behavior of Bayesian analyzer in SA 2.6

hi.  i just upgraded from 2.53 to 2.6 and i'm seeing something rather odd
about the Bayesian results: nearly every one is almost exactly 0%, 50%, or
100%!  it's almost as if it's applying an extreme rounding function to the
actual result.

now, these are turning out so far to be accurate, but i'm still highly
distrustful of such "perfect" results.  this clustering happened the instant
i upgraded SpamAssassin -- in fact, one of the first messages i sent after
this [just a plain test message, Subject: test, Body: test or something of
that sort] got flagged as 100% spam too!  i then went ahead and deleted my
bayes database completely and reran sa-learn on a corpus of about 2000 hams
and 2000 spams, but it made no difference to the clustering.
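
To put a number on the clustering described above, a rough tally like the following could help. It is only a sketch: the probabilities in `sample` are made up for illustration, the 0.01 tolerance is an arbitrary choice, and how you extract the Bayes probability for each message depends on your own setup.

```python
from collections import Counter

LABELS = ("~0%", "~50%", "~100%", "other")


def bucket(p, tol=0.01):
    """Label a Bayes probability as near 0, near 0.5, near 1, or 'other'."""
    for centre, label in ((0.0, "~0%"), (0.5, "~50%"), (1.0, "~100%")):
        if abs(p - centre) <= tol:
            return label
    return "other"


def clustering_summary(probs, tol=0.01):
    """Count how many probabilities fall into each bucket."""
    counts = Counter(bucket(p, tol) for p in probs)
    return {label: counts.get(label, 0) for label in LABELS}


# Hypothetical probabilities, invented for illustration only.
sample = [0.0, 0.001, 0.5, 0.499, 1.0, 0.9999, 0.73]
print(clustering_summary(sample))
```

If nearly everything lands in the first three buckets both before and after retraining, that would support the impression that the clustering comes from the scoring itself rather than from the contents of the database.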

one thing to note here: version 2.53 was actually compiled by my isp and
sitting in /usr/local/bin and such, whereas 2.6 was compiled by me and under
my home dir.  i'm not sure if that is at all significant.  my isp used the
following local.cf:

----------------------------------------------------------------
# Add your own customisations to this file.  See 'man Mail::SpamAssassin::Conf'
# for details of what can be tweaked.
#

## SIGNATURE HOSTS
## $SMEId: confluence/core/spamassassin/VERIO/local.cf,v 1.2 2003/03/26 21:48:14  scottw Exp $

## 2.43: do not rewrite the subject
rewrite_subject      0

## 2.43: put the SA report in 'X-' headers instead of rewriting the body
report_header        1

## 2.43: keep it short
use_terse_report     1

## 2.43: don't need stars
spam_level_stars     0

## 2.43: leave the message alone--let the client worry about it
defang_mime          0

## 2.50: don't write the report as a separate mime part
report_safe          0
----------------------------------------------------------------

on top of this i added

----------------------------------------------------------------
## OVERRIDE default site rules 2.43: don't keep it short
use_terse_report     0

always_add_report    1

## OVERRIDE default site rules 2.43: do need stars
spam_level_stars     1
----------------------------------------------------------------

and after upgrading, my local.cf contains nothing but comment lines, so the
format of the output messages has changed somewhat, but i can't see how this
can have any effect on the Bayes scores.
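
As a sanity check on that last point, here is a small sketch that merges the two sets of settings quoted above and lists what was effectively in force under 2.53. It assumes that a later setting simply replaces an earlier one (which is what the "OVERRIDE" comments imply), and the `"bayes" in name` test is only a naming heuristic, not a statement about what SpamAssassin does internally.

```python
def parse_settings(text):
    """Parse 'name value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


# Settings copied from the isp's local.cf quoted above.
ISP_CONF = """
rewrite_subject      0
report_header        1
use_terse_report     1
spam_level_stars     0
defang_mime          0
report_safe          0
"""

# The overrides added on top of it.
MY_OVERRIDES = """
use_terse_report     0
always_add_report    1
spam_level_stars     1
"""

# Assumption: later settings replace earlier ones.
effective = parse_settings(ISP_CONF)
effective.update(parse_settings(MY_OVERRIDES))

for name, value in sorted(effective.items()):
    print(f"{name:20s} {value}")

# Heuristic only: look for any setting whose name mentions bayes.
bayes_related = [name for name in effective if "bayes" in name]
print("bayes-related settings:", bayes_related)
```

Every setting in the merged result concerns how the report and headers are written, and none mentions Bayes by name, which fits the expectation that losing them changed only the output format.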

any ideas???

ben






-------------------------------------------------------
This SF.net email sponsored by: Enterprise Linux Forum Conference & Expo The
Event For Linux Datacenter Solutions & Strategies in The Enterprise Linux in
the Boardroom; in the Front Office; & in the Server Room
http://www.enterpriselinuxforum.com
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk


