On Mon, 29 Mar 2010 13:03:59 +0200
Kai Schaetzl <mailli...@conactive.com> wrote:

> Alex wrote on Sun, 28 Mar 2010 13:38:25 -0400:
> 
> > I have a bayes db that's about 160MB with a 40MB token db on a
> > system with about 100k messages per day.
> 
> Well, what's the missing 120 MB? The journal? Do a complete sync and
> then delete it.
 
Probably the signatures in bayes_seen - there's no mechanism for ageing
them out.

> You should be
> aware that the expiry kicks in at 75%, not at 100% of max_db_size.

And expiry may then reduce the token count to 37.5% of the nominal size.
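
To put numbers on the two percentages quoted above, here is a rough
sketch in Python. It takes the figures in this thread at face value and
assumes "nominal" means the configured bayes_expiry_max_db_size; the
function name and the 150000 example value are purely illustrative, and
this is not how SpamAssassin itself computes anything.

```python
def expiry_thresholds(max_db_size: int) -> dict:
    """Token counts implied by the percentages quoted in this thread.

    Assumption: both percentages are fractions of the configured
    maximum (bayes_expiry_max_db_size). Hypothetical helper, not part
    of SpamAssassin.
    """
    return {
        # 75% of the configured maximum (the figure Kai quotes)
        "expiry_trigger": max_db_size * 3 // 4,
        # 37.5% of the configured maximum, i.e. half of the 75% figure
        "possible_floor": max_db_size * 3 // 8,
    }


if __name__ == "__main__":
    # 150000 is only an example value for the configured maximum.
    for name, tokens in expiry_thresholds(150000).items():
        print(f"{name}: {tokens} tokens")
```

With a maximum of 150000 that works out to 112500 and 56250 tokens, so
the database could be left with well under half of what the setting
suggests.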

> I suggest you change to SQL. This eliminates the journal.

Isn't that slower than a journalled db?


> > database was too big, so I lowered it back down, but I think that
> > was a mistake.
> 
> "too big" is not an absolute figure. If you store 1-occurrence tokens
> you will obviously have more tokens than without them.

There's not really a choice since all tokens start that way.

> You should use autolearn if you don't already.

Autolearning can make things worse: the extra tokens it adds shorten
the retention period.
