On Sat, Nov 25, 2006 at 01:41:50PM -0500, Jason Frisvold wrote:
> With respect to bayes_tok though, can that be trimmed at all with
> minimal impact? 3GB is a tad large for the database, though I guess
> that depends on the number of users. I can't think of any way to
> limit that, though, and I wonder how even larger entities can deal
> with databases that must be much larger.
It depends why the file is 3GB. Yes, that's *WAY* huge. There are a few
possibilities here:

1) You have a huge (HUGE) number of tokens.
2) It could be a sparse file, so "file size 3GB" does not mean "using
   3GB on disk".
3) Something is crazy with your installed Berkeley DB libs that causes
   it to have huge files.

So if you don't have a crazy huge number of tokens (on my system, ~500k
tokens equates to ~10MB of DB, fwiw), I'd look at the libdb/DB_File
stuff. Converting to SQL may also be useful.
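To rule case 2 in or out, compare the file's apparent size with what is
actually allocated on disk (comparing "du" against "ls -l" gets at the
same thing). Here's a minimal Python sketch; the path is just an example
of a typical per-user bayes DB location, so adjust it to wherever yours
actually lives:

#!/usr/bin/env python
# Sparse-file check: compare apparent size to blocks allocated on disk.
import os

# Hypothetical path -- substitute your actual bayes_toks location.
path = "/home/user/.spamassassin/bayes_toks"

st = os.stat(path)
apparent = st.st_size           # the size "ls -l" reports
on_disk = st.st_blocks * 512    # st_blocks counts 512-byte units on POSIX

print("apparent: %d bytes, on disk: %d bytes" % (apparent, on_disk))
if on_disk < apparent:
    print("looks sparse: only %.1f%% actually allocated"
          % (100.0 * on_disk / apparent))

For case 1, "sa-learn --dump magic" should report the token count
(ntokens), which tells you quickly whether you really are sitting on an
absurd number of tokens or whether the bloat is elsewhere.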
-- 
Randomly Selected Tagline:
"It's a good cause... Cause it's good...?"  - Hardcore TV