On Sat, Nov 25, 2006 at 01:41:50PM -0500, Jason Frisvold wrote:
> With respect to bayes_tok though, can that be trimmed at all with
> minimal impact?  3GB is a tad large for the database, though I guess
> that depends on the number of users.  I can't think of any way to
> limit that, though, and I wonder how even larger entities can deal
> with databases that must be much larger.

It depends on why the file is 3GB.  Yes, that's *WAY* huge.

So there are a few possibilities here:

1) You have a huge (HUGE) number of tokens.
2) It could be a sparse file, so "file size 3GB" does not mean "using
   3GB on disk" (there's a quick check after this list).
3) Something is crazy with your installed Berkeley DB libs that causes
   it to have huge files.
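
A quick way to rule out case 2 is to compare the file's apparent size
with the space actually allocated on disk (a sketch assuming a typical
Linux box with GNU coreutils; adjust the path to wherever your
bayes_tok lives):

    # apparent size, as recorded in the inode
    ls -lh /path/to/bayes_tok

    # space actually allocated on disk
    du -h /path/to/bayes_tok

If du reports far less than ls, the file is sparse and the 3GB figure
is mostly holes.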

So if you don't have a crazy huge number of tokens (on my system, ~500k
tokens equate to ~10MB of DB, fwiw), I'd look at the installed
libdb/DB_File versions.  Converting the Bayes store to SQL may also be
worth considering; see the sa-learn commands below.
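
For checking and trimming the token count, the stock sa-learn tool
covers both (run it as the user who owns the bayes files; add
-u <user> if you keep per-user databases):

    # show database statistics, including the token count
    sa-learn --dump magic

    # force an expiry run to prune old tokens
    sa-learn --force-expire

And if you do go the SQL route, a text backup should carry the
existing tokens across instead of retraining from scratch:

    # dump the database to a portable text backup
    sa-learn --backup > bayes_backup.txt

    # then, after pointing bayes_store_module at the SQL backend:
    sa-learn --restore bayes_backup.txt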

-- 
Randomly Selected Tagline:
"It's a good cause... Cause it's good...?"      - Hardcore TV
