On Tue, 2003-08-19 at 17:03, Chris Barnes wrote:
> Yorkshire Dave <[EMAIL PROTECTED]> wrote:
> > I'd figure it's better to use your recent ham, the same as with spam,
> > that way your bayes database contains tokens from what's happening now
> > as opposed to what happened years ago.
> 
> I understand feeding it the recent spam, but see no disadvantages (and
> some possible advantages) to feeding it all the ham.
> 

I'm led to believe it works best when it's fed a balanced diet. I have
one or two that are way out of balance and they still work fine; I have
no idea at what point it breaks down badly. You could always try it and
see; if you don't like the results you can delete your bayes database
and start over.
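
If you want to try the balanced approach, here is a rough sketch of one
way to pick the ham: take only the newest messages, capped at however
many spam you have. The function name, the paths and the dates are all
made up for illustration, and it assumes you already have a date for
each message; it is not anything SpamAssassin ships.

```python
from datetime import datetime


def pick_recent_ham(ham, spam_count):
    """Return the newest ham entries, capped at spam_count, newest first.

    ham is a list of (path, received_date) tuples (hypothetical layout).
    """
    newest_first = sorted(ham, key=lambda item: item[1], reverse=True)
    return newest_first[:max(spam_count, 0)]


# Hypothetical corpus: three ham messages, but only two spam to balance against.
ham = [
    ("ham/old.eml", datetime(1999, 5, 1)),
    ("ham/recent.eml", datetime(2003, 8, 1)),
    ("ham/last_year.eml", datetime(2002, 11, 15)),
]

balanced = pick_recent_ham(ham, spam_count=2)
print([path for path, _ in balanced])
# prints ['ham/recent.eml', 'ham/last_year.eml']
```

The files it picks could then be fed to sa-learn --ham, and the rest
left out or added later if you want to compare the two approaches.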

For the really old ham though, you'll probably be learning a lot of
tokens which are no longer relevant. It might give you a fat, slow bayes
database, but if you're on sleek, fast hardware that might not bother you.

If you want it to work right from the get-go, keeping it in balance is
probably best, but there's no harm in doing a little experimenting. Why
not try it both ways, do a comparison, and tell us all what you
find :)

-- 
Yorkshire Dave


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
