On Thursday 20 September 2007 07:59, Graham Murray wrote:
> "Loren Wilton" <[EMAIL PROTECTED]> writes:
> > If tokens are expired from the DB based on time, and assuming *all*
> > tokens older than some date are expired, wouldn't it be reasonable to
> > prune bayes_seen to the expiry date after the expiry run?
>
> You cannot assume that all tokens earlier than some date have expired. A
> token (in bayes_token) is only expired when its last occurrence in an
> email was before the expiry interval. So it is perfectly possible for a
> token from the very first email ever learnt to still be in bayes years
> later.

It doesn't really matter whether the tokens have expired, I think. You
probably don't want to relearn an old message anyway. The Bayes system can
record the message date (e.g. from the top Received: field), expire
messages older than a certain age, and refuse to learn older messages,
unless explicitly overridden (for example when populating a clean bayes DB
with an initial corpus).

-- 
Magnus Holmgren        [EMAIL PROTECTED]
(No Cc of list mail needed, thanks)
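
A minimal sketch of the date-based policy suggested above, written in Python rather than SpamAssassin's Perl and not taken from its code. The names `MAX_AGE`, `message_date`, `should_learn` and `expire_seen` are hypothetical, the 180-day cutoff is an arbitrary assumption, and the "seen" store is modelled as a plain dict of message ID to date instead of the real bayes_seen database.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical cutoff; not an actual SpamAssassin setting.
MAX_AGE = timedelta(days=180)


def message_date(raw_message):
    """Return the date of the topmost Received: field, or None if absent."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    # In a Received: field the timestamp follows the last ';'.
    _, sep, stamp = received[0].rpartition(";")
    if not sep:
        return None
    try:
        date = parsedate_to_datetime(stamp.strip())
    except (TypeError, ValueError):
        return None
    if date.tzinfo is None:
        # A "-0000" zone yields a naive datetime; treat it as UTC.
        date = date.replace(tzinfo=timezone.utc)
    return date


def should_learn(raw_message, now, force=False):
    """Refuse messages older than MAX_AGE unless explicitly overridden."""
    if force:
        # Override, e.g. when seeding a clean DB with an initial corpus.
        return True
    date = message_date(raw_message)
    if date is None:
        # Assumption: undated messages are accepted.
        return True
    return now - date <= MAX_AGE


def expire_seen(seen, now):
    """Drop 'seen' entries (message ID -> date) older than MAX_AGE."""
    cutoff = now - MAX_AGE
    return {msgid: date for msgid, date in seen.items() if date >= cutoff}
```

Because old messages are refused at learn time, entries older than the cutoff can be dropped from the seen store without risking an accidental relearn, regardless of which tokens are still alive.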