> -----Original Message-----
> From: Michael Parker [mailto:[EMAIL PROTECTED]
> 
> In order to expire from bayes_seen you have to know that there are no
> longer any tokens from a given msg in the bayes_token database.  This is
> a hard problem, mapping tokens to msgs, so it wasn't done.

This could be achieved with a many-to-many table, mapping message IDs
(bayes_seen entries) to their tokens (bayes_token entries). This
many-to-many relation may be keyed on message ids only, by the way.

Was this discarded because a many-to-many relation is regarded as
overkill?
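
A minimal sketch of that many-to-many idea, assuming SQL-based storage
(SQLite here, purely for illustration). The table and column names are
simplified, the msg_token link table and all helper functions are
hypothetical, and none of this is SpamAssassin's actual schema or code:

```python
import sqlite3

def make_db():
    """Create a simplified, hypothetical schema in an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE bayes_seen  (msgid TEXT PRIMARY KEY, flag TEXT);
        CREATE TABLE bayes_token (token TEXT PRIMARY KEY);
        -- Hypothetical link table: one row per (message, token) pair.
        CREATE TABLE msg_token (
            msgid TEXT NOT NULL,
            token TEXT NOT NULL,
            PRIMARY KEY (msgid, token)
        );
    """)
    return db

def learn(db, msgid, flag, tokens):
    """Record a learned message together with the tokens it contributed."""
    db.execute("INSERT OR REPLACE INTO bayes_seen VALUES (?, ?)", (msgid, flag))
    for tok in set(tokens):
        db.execute("INSERT OR IGNORE INTO bayes_token VALUES (?)", (tok,))
        db.execute("INSERT OR IGNORE INTO msg_token VALUES (?, ?)", (msgid, tok))

def expire_tokens(db, tokens):
    """Drop expired tokens and the link rows that point at them."""
    for tok in tokens:
        db.execute("DELETE FROM bayes_token WHERE token = ?", (tok,))
        db.execute("DELETE FROM msg_token WHERE token = ?", (tok,))

def expire_seen(db):
    """Delete seen entries that no longer have any token left; return the count."""
    cur = db.execute(
        "DELETE FROM bayes_seen WHERE msgid NOT IN (SELECT msgid FROM msg_token)"
    )
    return cur.rowcount
```

The cost is one link row per (message, token) pair, so the link table
would likely be much larger than bayes_seen itself.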


> Likewise no one ever did anything about expiring the bayes_seen
> entries.

I guess this would need a further key on bayes_seen: the time of insertion
into the db. Was this discarded because DB_File (and BerkeleyDB) don't
allow multiple keys on a database?
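
For illustration only, time-based expiry of seen entries could look
roughly like this on SQL-based storage (SQLite here). The inserted_at
column, the index name and both helper functions are assumptions made
for this sketch, not part of the existing schema:

```python
import sqlite3
import time

def make_seen_db():
    """Create a hypothetical bayes_seen table with an insertion timestamp."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE bayes_seen (
            msgid       TEXT PRIMARY KEY,
            flag        TEXT,
            inserted_at INTEGER NOT NULL   -- assumed extra column (Unix time)
        );
        -- Secondary index, so expiry does not have to scan the whole table.
        CREATE INDEX bayes_seen_inserted_idx ON bayes_seen (inserted_at);
    """)
    return db

def mark_seen(db, msgid, flag, now=None):
    """Store a seen entry, stamped with the current (or a supplied) time."""
    ts = int(time.time()) if now is None else now
    db.execute("INSERT OR REPLACE INTO bayes_seen VALUES (?, ?, ?)",
               (msgid, flag, ts))

def expire_seen_older_than(db, max_age, now=None):
    """Delete entries inserted more than max_age seconds ago; return the count."""
    ts = int(time.time()) if now is None else now
    cur = db.execute("DELETE FROM bayes_seen WHERE inserted_at < ?",
                     (ts - max_age,))
    return cur.rowcount
```

Unlike the token-based approach, this ignores whether a message's tokens
still exist and simply forgets entries after a fixed age.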

It seems to me that most enhancements to the Bayes database would require
switching to BerkeleyDB and waiting for a version that implements
BerkeleyDB's secondary database semantics; otherwise most of them would be
possible only on SQL-based storage.

Giampaolo

> 
> Sounds like a good project, there might even be a bugzilla enhancement
> opened already.
> 
> Patches are welcome.
> 
> Michael
> 
> 
> 
> > Theo Van Dinter wrote:
> >> On Wed, Sep 19, 2007 at 03:23:50PM -0600, Mr. Gus wrote:
> >>
> >>>> The file bayes_seen has grown in size to 256GB!  (274992939008)
> >>>> How do I cap the size limit of this file? I want to have it not
> >>>> grow larger then say 800mb at the most!
> >>>>
> >>> You need to expire old bayes tokens. The limit is set not as a
> >>> size, but as
> >>>
> >> Expiring bayes tokens does nothing to the bayes_seen file.  There is
> >> no expiry for bayes_seen.
> >>
> >> If the seen file is bigger than you'd like, I'd just rm the file.
> >>
> >>
> >
