On Mon, Aug 25, 2003 at 03:36:50PM +1200, Simon Byrnand wrote:
> Ok. Did the statistics file give any suggestion of what kind of balance 
> between spam and ham would get autolearnt with those thresholds ? Is the 

Have you looked at the STATISTICS* files?

> new Bayes algorithm any more resistant to being skewed by learning a lot 
> more ham than spam ? (Which is what tended to happen with 0.1 and 12 under 
> 2.55 anyway, I ended up changing 0.1 to -1 because the ham learnt was 
> outweighing spam by nearly 5 to 1)

All Bayes systems will get skewed if you bias the learning one way or
the other, 2.60 hasn't changed that.

As for the autolearn values, people may have to change them (and other
default values) as necessary.  We try to make the defaults generically
good for everyone, but everyone's situation is different. :)
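
To see how a given pair of thresholds would balance out on your own mail,
you can count how many messages fall on each side. The following is only a
rough sketch, not SpamAssassin code: the function name, the sample scores,
and the exact comparison rules (below the lower threshold means ham, at or
above the upper threshold means spam) are assumptions made for illustration.

```python
def autolearn_balance(scores, ham_threshold, spam_threshold):
    """Count messages that would be auto-learnt as ham or as spam.

    Assumption: a message is learnt as ham when its score is below
    ham_threshold and as spam when it is at or above spam_threshold;
    anything in between is not learnt at all.
    """
    ham = sum(1 for s in scores if s < ham_threshold)
    spam = sum(1 for s in scores if s >= spam_threshold)
    return ham, spam


# Made-up scores for illustration only, not real mail statistics.
sample_scores = [-2.3, -1.5, -0.4, 0.0, 0.05, 1.2, 4.8, 9.9, 13.1, 25.0]

# Compare the two lower thresholds mentioned in this thread.
for ham_threshold in (0.1, -1):
    ham, spam = autolearn_balance(sample_scores, ham_threshold, 12)
    print(f"nonspam threshold {ham_threshold}: {ham} ham, {spam} spam")
```

Feeding it the scores from your own logs should give a fair idea of
whether one side is going to swamp the other before you commit to a value.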

> Ok, I can understand that.... guess I'll have to rework my system a bit to 
> work around it... for now I'll drop the threshold to 50... assuming it won't 
> be reduced to less than that later on ? :)

If I could tell the future, I'd be winning lotteries left and right. ;)

I would assume it won't be reduced below 50.  The idea was that 50
should be high enough for anyone to know "ok, this is spam, really."
I can't see why we'd lower it right now.

-- 
Randomly Generated Tagline:
"The one computer-language course I took was Cobol, and basically, I just
 slept the whole quarter. Then, the night before the final, I read the IBM
 Cobol manual, and I got the top score in the final." - Larry Wall
