> -----Original Message-----
> From: Michael Parker [mailto:[EMAIL PROTECTED]
>
> In order to expire from bayes_seen you have to know that there are no
> longer any tokens from a given msg in the bayes_token database. This is
> a hard problem, mapping tokens to msgs, so it wasn't done.

This could be achieved with a many-to-many table mapping message IDs (bayes_seen entries) to their tokens (bayes_token entries). Incidentally, this many-to-many relation could be keyed on message IDs only. Was this discarded because a many-to-many relation is regarded as overkill?

> Likewise no one ever did anything about expiring the bayes_seen
> entries.

I guess this would need a further key on bayes_seen: the time of insertion into the db. Was this discarded because DB_File (and BerkeleyDB) doesn't allow multiple keys on a database?

It seems to me that most enhancements to the Bayes database would require switching to BerkeleyDB and waiting for a version implementing BerkeleyDB's secondary-database semantics; otherwise most of them would be possible only on SQL-based storage.

Giampaolo

> Sounds like a good project, there might even be a bugzilla enhancement
> opened already.
>
> Patches are welcome.
>
> Michael
>
> > Theo Van Dinter wrote:
> >> On Wed, Sep 19, 2007 at 03:23:50PM -0600, Mr. Gus wrote:
> >>
> >>>> The file bayes_seen has grown in size to 256GB! (274992939008)
> >>>> How do I cap the size limit of this file? I want to have it not grow larger
> >>>> then say 800mb at the most!
> >>>>
> >>> You need to expire old bayes tokens. The limit is set not as a size, but as
> >>>
> >> Expiring bayes tokens does nothing to the bayes_seen file. There is no expiry
> >> for bayes_seen.
> >>
> >> If the seen file is bigger than you'd like, I'd just rm the file.
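
For illustration only, here is a minimal sketch of the many-to-many mapping and the insertion-time key discussed above, assuming an SQL-based store. It uses Python's sqlite3 module purely as a stand-in; the table names (seen, token, seen_token), the column names and the helper functions are all hypothetical and do not correspond to SpamAssassin's actual schema or code.

```python
import sqlite3

# Hypothetical schema, for illustration only (not SpamAssassin's real one).
SCHEMA = """
CREATE TABLE seen (msgid TEXT PRIMARY KEY, flag TEXT NOT NULL,
                   added INTEGER NOT NULL);         -- time of insertion
CREATE TABLE token (token TEXT PRIMARY KEY, atime INTEGER NOT NULL);
CREATE TABLE seen_token (msgid TEXT NOT NULL, token TEXT NOT NULL,
                         PRIMARY KEY (msgid, token)); -- many-to-many link
CREATE INDEX seen_added_idx ON seen(added);         -- the "further key"
"""

def learn(db, msgid, flag, tokens, now):
    """Record a learned message, its tokens, and the msg-to-token links."""
    db.execute("INSERT OR REPLACE INTO seen VALUES (?, ?, ?)",
               (msgid, flag, now))
    for tok in set(tokens):
        db.execute("INSERT OR REPLACE INTO token VALUES (?, ?)", (tok, now))
        db.execute("INSERT OR IGNORE INTO seen_token VALUES (?, ?)",
                   (msgid, tok))

def expire_tokens(db, cutoff):
    """Drop tokens whose access time is older than cutoff."""
    db.execute("DELETE FROM token WHERE atime < ?", (cutoff,))

def expire_seen(db):
    """Drop seen entries with no surviving tokens; return how many."""
    cur = db.execute("""
        DELETE FROM seen
        WHERE NOT EXISTS (
            SELECT 1 FROM seen_token st
            JOIN token t ON t.token = st.token
            WHERE st.msgid = seen.msgid)""")
    removed = cur.rowcount
    # Clean up link rows that now point at deleted messages.
    db.execute("DELETE FROM seen_token "
               "WHERE msgid NOT IN (SELECT msgid FROM seen)")
    return removed

def expire_seen_older_than(db, cutoff):
    """Alternative: expire seen entries purely by insertion time."""
    cur = db.execute("DELETE FROM seen WHERE added < ?", (cutoff,))
    return cur.rowcount
```

With a link table like this, expire_seen() can run right after token expiry and remove only the messages that no longer have any token left, while expire_seen_older_than() shows the simpler age-based variant. The cost is one link row per (message, token) pair, which is presumably the "overkill" concern raised above.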