[SAtalk] Re: Some questions about 2.60-rc2

Theo Van Dinter Mon, 25 Aug 2003 04:15:03 +0000

On Mon, Aug 25, 2003 at 03:36:50PM +1200, Simon Byrnand wrote:
> Ok. Did the statistics file give any suggestion of what kind of balance 
> between spam and ham would get autolearnt with those thresholds ? Is the


Have you looked at the STATISTICS* files?

> new Bayes algorithm any more resistant to being skewed by learning a lot 
> more ham than spam ? (Which is what tended to happen with 0.1 and 12 under 
> 2.55 anyway, I ended up changing 0.1 to -1 because the ham learnt was 
> outweighing spam by nearly 5 to 1)

All Bayes systems will get skewed if you bias the learning one way or
the other; 2.60 hasn't changed that.

As for the autolearn values, people may have to change them (and other
default values) as necessary.  We try to make the defaults generically
good for everyone, but everyone's situation is different. :)
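
To get a rough feel for how a pair of autolearn thresholds shifts the
ham/spam balance, something like the sketch below may help. It is a
simplified model, not SpamAssassin's actual autolearn logic: the function
name, the inclusive comparisons, and the sample scores are all assumptions
made up for illustration.

```python
def autolearn_counts(scores, ham_threshold, spam_threshold):
    """Count messages a simplified autolearner would train as ham and as spam.

    Assumption: a message is learnt as ham when its score is at or below
    ham_threshold, and as spam when its score is at or above spam_threshold.
    Anything in between is not learnt at all.
    """
    ham = sum(1 for s in scores if s <= ham_threshold)
    spam = sum(1 for s in scores if s >= spam_threshold)
    return ham, spam


# Made-up scores for illustration only, not real mail data.
sample_scores = [-2.5, -1.2, -0.4, 0.0, 0.0, 0.05, 0.05, 1.8, 4.9, 7.3, 13.5, 21.0]

# Ham threshold 0.1, spam threshold 12: ham dominates what gets learnt.
print(autolearn_counts(sample_scores, 0.1, 12))   # (7, 2)

# Lowering the ham threshold to -1 learns far less ham, same spam.
print(autolearn_counts(sample_scores, -1, 12))    # (2, 2)
```

Running your own score distribution through a count like this shows how
lopsided the learning would be before you commit to a threshold.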

> Ok, I can understand that.... guess I'll have to rework my system a bit to 
> work around it... for now I'll drop the threshold to 50... assuming it won't 
> be reduced to less than that later on ? :)

If I could tell the future, I'd be winning lotteries left and right. ;)

I would assume it won't be reduced below 50.  The idea was that 50
should be high enough for anyone to know "ok, this is spam, really."
I can't see why we'd lower it right now.

-- 
Randomly Generated Tagline:
"The one computer-language course I took was Cobol, and basically, I just
 slept the whole quarter. Then, the night before the final, I read the IBM
 Cobol manual, and I got the top score in the final." - Larry Wall
