Re: [SAtalk] Auto-learn bug

Theo Van Dinter Wed, 31 Dec 2003 19:12:46 -0800

On Wed, Dec 31, 2003 at 03:39:12PM +0100, Csaba Kiss wrote:
> debug: auto-learn: currently using scoreset 3.  recomputing score based 
> on scoreset 1.
> debug: Score set 1 chosen.
> debug: auto-learn: original score: 0.1, recomputed score: 0.001
> debug: Score set 3 chosen.
> debug: auto-learn? yes, ham (0.001 < 0.1)
> debug: Learning Ham
> X-Spam-Status: No, hits=5.5 required=6.0 tests=BAYES_99,HTML_MESSAGE
> 
> Just to explain briefly: the Bayesian filter identified the e-mail as 
> 100% spam, yet it somehow converted the spam=1 result to 0.001 and 
> learnt it as ham. You can see that at the bottom there.


Well, Bayes says it's 100% spam, but the Bayes rules don't count towards
autolearn decisions.  As the debug output says, under the original
scoreset (3), with no learn or userconf rules applied, the message scores
0.1 (HTML_MESSAGE).  In scoreset 1 (used to determine autolearn), the
message only scores 0.001 (HTML_MESSAGE).  It then compares the 0.001
score to the required ham autolearn value of 0.1, determines the message
scores lower, and promptly learns it as ham.  No bug, works as designed.
(This should be in the docs and/or the FAQ, btw...)
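
To make the sequence above concrete, here is a minimal Python sketch of
the decision as described in this message.  It is not SpamAssassin's
actual code (which is Perl).  The function names, the SCORES table and
the spam threshold of 12.0 are assumptions for illustration; only the
HTML_MESSAGE scores (0.1 and 0.001) and the ham threshold of 0.1 come
from the debug output, and the BAYES_99 score of 5.4 is merely inferred
from hits=5.5 minus 0.1.

```python
# Illustrative per-rule scores keyed by scoreset number.
# HTML_MESSAGE values are taken from the debug output; the BAYES_99
# value for scoreset 3 is inferred (5.5 total - 0.1), and scoreset 1
# is assumed to give Bayes rules no score.
SCORES = {
    "HTML_MESSAGE": {3: 0.1, 1: 0.001},
    "BAYES_99": {3: 5.4, 1: 0.0},
}

# Rules that are ignored when computing the autolearn score
# (the "learn" rules; userconf rules would be excluded the same way).
LEARN_RULES = {"BAYES_99"}


def autolearn_score(hits, scoreset):
    """Sum the scores of the rules that hit, skipping learn rules."""
    return sum(SCORES[rule][scoreset] for rule in hits
               if rule not in LEARN_RULES)


def autolearn_decision(hits, ham_threshold=0.1, spam_threshold=12.0):
    """Return 'ham', 'spam' or None based on the scoreset 1 score.

    spam_threshold=12.0 is an assumed value, not taken from the thread.
    """
    recomputed = autolearn_score(hits, scoreset=1)
    if recomputed < ham_threshold:
        return "ham"
    if recomputed > spam_threshold:
        return "spam"
    return None


hits = ["BAYES_99", "HTML_MESSAGE"]
print(autolearn_score(hits, 3))   # original score, learn rules excluded
print(autolearn_score(hits, 1))   # recomputed score used for autolearn
print(autolearn_decision(hits))
```

With the values from the debug output, the recomputed score is below
the ham threshold, so the sketch reaches the same "learn as ham" result
even though BAYES_99 hit.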

There's been discussion about requiring both the original and the
recomputed score to be over/under the spam/ham autolearn threshold before
it'll actually autolearn, but we haven't really done anything with that yet.
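
For comparison, the stricter rule under discussion could look roughly
like the following Python sketch.  This is hypothetical: nothing like it
has been implemented, the function name and the spam threshold of 12.0
are assumptions, and the use of strict inequalities mirrors the
"0.001 < 0.1" comparison in the debug output.

```python
def strict_autolearn_decision(original, recomputed,
                              ham_threshold=0.1, spam_threshold=12.0):
    """Autolearn only if BOTH scores fall on the same side of a threshold.

    original   -- score under the active scoreset, learn rules excluded
    recomputed -- score under scoreset 1
    spam_threshold=12.0 is an assumed value, not taken from the thread.
    """
    if original < ham_threshold and recomputed < ham_threshold:
        return "ham"
    if original > spam_threshold and recomputed > spam_threshold:
        return "spam"
    return None


# Scores from the debug output: original 0.1, recomputed 0.001.
print(strict_autolearn_decision(0.1, 0.001))
```

Under these assumptions the message from the debug output would not be
autolearned, because its original score of 0.1 is not strictly below
the ham threshold of 0.1.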

-- 
Randomly Generated Tagline:
"After "Happy Gilmore" I thought I'd get lots of movie parts.  But all
 the roles I want go to Brad Pitt and Tom Cruise." - Bob Barker
