For the record, I also began noticing many more 100% Bayesian matches after upgrading from 2.55 to 2.6. So far it hasn't resulted in any false positives that I'm aware of, but it has left me feeling slightly uneasy.
Brian

-----Original Message-----
From: Ben Wing [mailto:[EMAIL PROTECTED]
Sent: Friday, October 17, 2003 4:18 PM
To: [EMAIL PROTECTED]
Subject: [SAtalk] strange behavior of Bayesian analyzer in SA 2.6

hi. i just upgraded from 2.53 to 2.6 and i'm seeing something rather odd about the Bayesian results: nearly every one is almost exactly 0%, 50%, or 100%! it's almost as if it's applying an extreme rounding function to the actual result. now, these are turning out so far to be accurate, but i'm still highly distrustful of such "perfect" results.

this clustering happened the instant i upgraded spam assassin -- in fact, one of the first messages i sent after this [just a plain test message, Subject: test, Body: test or something of that sort] got flagged as 100% spam too!

i then went ahead and deleted my bayes database completely and reran sa-learn on a corpus of about 2000 hams and 2000 spams, but it made no difference to the clustering.

one thing to note here: version 2.53 was actually compiled by my isp and sitting in /usr/local/bin and such, whereas 2.6 was compiled by me and under my home dir. i'm not sure if that is at all significant.

my isp used the following local.cf:

----------------------------------------------------------------
# Add your own customisations to this file. See 'man Mail::SpamAssassin::Conf'
# for details of what can be tweaked.
#

## SIGNATURE HOSTS
## $SMEId: confluence/core/spamassassin/VERIO/local.cf,v 1.2 2003/03/26 21:48:14 scottw Exp $

## 2.43: do not rewrite the subject
rewrite_subject 0

## 2.43: put the SA report in 'X-' headers instead of rewriting the body
report_header 1

## 2.43: keep it short
use_terse_report 1

## 2.43: don't need stars
spam_level_stars 0

## 2.43: leave the message alone--let the client worry about it
defang_mime 0

## 2.50: don't write the report as a separate mime part
report_safe 0
----------------------------------------------------------------

on top of this i added

----------------------------------------------------------------
## OVERRIDE default site rules 2.43: don't keep it short
use_terse_report 0
always_add_report 1

## OVERRIDE default site rules 2.43: do need stars
spam_level_stars 1
----------------------------------------------------------------

and after upgrading, my local.cf contains nothing but comment lines, so the format of the output messages has changed somewhat, but i can't see how this can have any effect on the Bayes scores.

any ideas???
ben

-------------------------------------------------------
This SF.net email sponsored by: Enterprise Linux Forum Conference & Expo
The Event For Linux Datacenter Solutions & Strategies in The Enterprise
Linux in the Boardroom; in the Front Office; & in the Server Room
http://www.enterpriselinuxforum.com
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
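
One way to put numbers on the clustering described in this thread is to tally which Bayes rule fired on each scanned message. The sketch below is only an illustration, not something taken from SpamAssassin itself: it assumes each message carries an X-Spam-Status header whose tests= list includes a rule name of the form BAYES_NN (for example BAYES_00, BAYES_50, BAYES_99). The function name tally_bayes_buckets and the sample header values are hypothetical.

```python
import re
from collections import Counter

# Matches Bayes rule names of the assumed form BAYES_NN, e.g. BAYES_00 or BAYES_99.
BAYES_RULE = re.compile(r"\bBAYES_(\d{2})\b")


def tally_bayes_buckets(status_headers):
    """Count how many header values mention each BAYES_NN bucket.

    status_headers is an iterable of X-Spam-Status header values (strings).
    Returns a Counter keyed by the two-digit bucket, e.g. "00" or "99".
    """
    counts = Counter()
    for header in status_headers:
        # Count each distinct bucket at most once per message.
        for bucket in set(BAYES_RULE.findall(header)):
            counts[bucket] += 1
    return counts


if __name__ == "__main__":
    # Made-up header values, for illustration only.
    sample = [
        "Yes, hits=7.2 required=5.0 tests=BAYES_99,HTML_MESSAGE",
        "No, hits=-4.9 required=5.0 tests=BAYES_00",
        "No, hits=0.1 required=5.0 tests=BAYES_50,NO_REAL_NAME",
    ]
    for bucket, n in sorted(tally_bayes_buckets(sample).items()):
        print(f"BAYES_{bucket}: {n}")
```

Running the same tally over mail scanned before and after the upgrade would show whether the intermediate buckets really have emptied out, or whether the effect only looks that way from a handful of messages.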