On Thursday 20 September 2007 07:59, Graham Murray wrote:
> "Loren Wilton" <[EMAIL PROTECTED]> writes:
> > If tokens are expired from the DB based on time, and assuming *all*
> > tokens older than some date are expired, wouldn't it be reasonable to
> > prune bayes_seen to the expiry date after the expiry run?
>
> You cannot assume that all tokens earlier than some date have expired. A
> token (in bayes_token) is only expired when its last occurrence in an
> email was before the expiry interval. So it is perfectly possible for a
> token from the very first email ever learnt to still be in bayes years
> later.

It doesn't really matter whether the tokens have expired, I think. You 
probably don't want to relearn an old message anyway.

The Bayes system could record each message's date (e.g. from the topmost
Received: field), expire bayes_seen entries older than a certain age, and
refuse to learn messages older than that age unless explicitly overridden
(for example, when populating a clean Bayes DB with an initial corpus).
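
To make the idea concrete, here is a rough sketch in Python. It is not how
SpamAssassin is implemented; SeenStore, message_date, MAX_AGE and the force
flag are all names I made up for illustration, and the 180-day cutoff is an
arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

MAX_AGE = timedelta(days=180)  # hypothetical cutoff, not a SpamAssassin setting


def message_date(raw):
    """Return the date of the topmost Received: field, or None if unusable."""
    received = message_from_string(raw).get_all("Received") or []
    if not received:
        return None
    # The timestamp follows the last ';' in a Received: field.
    stamp = received[0].rsplit(";", 1)[-1].strip()
    try:
        date = parsedate_to_datetime(stamp)
    except (TypeError, ValueError):
        return None
    if date.tzinfo is None:
        # Treat a missing UTC offset as UTC so comparisons stay valid.
        date = date.replace(tzinfo=timezone.utc)
    return date


class SeenStore:
    """Hypothetical stand-in for bayes_seen that also stores a date."""

    def __init__(self, max_age=MAX_AGE):
        self.max_age = max_age
        self.seen = {}  # message id -> message date

    def learn(self, msgid, date, now, force=False):
        """Return True if the message should be learnt, False if refused."""
        if msgid in self.seen:
            return False  # already learnt
        if not force and date is not None and now - date > self.max_age:
            return False  # too old, and no explicit override given
        self.seen[msgid] = date if date is not None else now
        return True

    def expire(self, now):
        """Drop entries older than max_age; return how many were dropped."""
        cutoff = now - self.max_age
        old = [msgid for msgid, date in self.seen.items() if date < cutoff]
        for msgid in old:
            del self.seen[msgid]
        return len(old)
```

Since learn() refuses anything older than the cutoff, it is safe for
expire() to forget such messages: they cannot be learnt twice by accident,
only by passing force=True again.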

-- 
Magnus Holmgren        [EMAIL PROTECTED]
                       (No Cc of list mail needed, thanks)
