Re: [SAtalk] Re: Bayes database stats

2003-12-15 Thread Alexander Litvinov
> Plus, Alexander may have been just counting up the total Bayes directory
> size, which could very likely bring it at least to 44.

> LANG=C ls -lah ~/.spamassassin/
total 68M
drwx-- 2 lan users 288 Dec 16 09:21 .
drwx-- 73 lan users 4.5k Dec 15 19:27 ..
-rw--

Re: [SAtalk] Re: Bayes database stats

2003-12-15 Thread Alexander Litvinov
> How did you calculate those statistics -- does sa-learn do this, or did
> you code up something for sa-learn's output?

I have written a small Perl script for this.
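The Perl script itself is not shown in the thread. As a rough sketch of how such counts could be produced, the Python below tallies how many tokens were seen once, twice, and so on. It assumes the input is the text output of `sa-learn --dump`, with five whitespace-separated columns per line (probability, spam count, ham count, access time, token), and it assumes "seen N times" means spam count plus ham count. The function name `occurrence_histogram` and the sample lines are hypothetical.

```python
from collections import Counter


def occurrence_histogram(dump_lines):
    """Map 'times seen' -> number of tokens, from sa-learn dump-style lines.

    Assumed line layout (not confirmed by the thread):
        probability  spam_count  ham_count  atime  token
    """
    hist = Counter()
    for line in dump_lines:
        fields = line.strip().split(None, 4)
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        _prob, nspam, nham, _atime, token = fields
        if token.startswith("non-token data"):
            continue  # skip database metadata rows
        try:
            seen = int(nspam) + int(nham)
        except ValueError:
            continue  # skip lines whose counts are not integers
        hist[seen] += 1
    return hist


# Made-up sample data, for illustration only.
SAMPLE = [
    "0.000 0 3 0 non-token data: bayes db version",
    "0.958 1 0 1071300000 a1b2c3d4e5",
    "0.016 0 1 1071300000 f6e7d8c9b0",
    "0.500 1 1 1071300000 0011223344",
    "0.987 3 0 1071300000 5566778899",
]

hist = occurrence_histogram(SAMPLE)
total_tokens = sum(hist.values())
```

On a real database the input could come from piping the dump into the script and passing `sys.stdin` as `dump_lines`.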

[SAtalk] Re: Bayes database stats

2003-12-13 Thread Bryan Hoover
Matt Kettler wrote:
> Well, 5mb * (743/100) = 37.15mb... that's pretty close to 44mb at an
> estimate. Doesn't seem large at all given the specs..

Heh.. True that. My brain is *not* an arithmetic calculator :). Plus, Alexander may have been just counting up the total Bayes directory size, which
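Matt's estimate can be reproduced in a couple of lines. The sketch below reads his formula as "about 5 MB per 100,000 tokens", which is an interpretation of `5mb * (743/100)` rather than something stated outright in the thread. The function name and the default rate parameter are hypothetical.

```python
def estimate_db_size_mb(total_tokens, mb_per_100k_tokens=5.0):
    """Rough Bayes DB size estimate, assuming size scales linearly with tokens."""
    return mb_per_100k_tokens * (total_tokens / 100_000)


# 743 thousand tokens, as rounded in the quoted message.
rounded_estimate = round(estimate_db_size_mb(743_000), 2)

# The exact token count reported earlier in the thread.
exact_estimate = round(estimate_db_size_mb(742_753), 2)
```

Either way the result lands a little above 37 MB, which is the figure being compared against the reported 44 MB.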

Re: [SAtalk] Re: Bayes database stats

2003-12-13 Thread Matt Kettler
At 02:45 PM 12/13/2003, Bryan Hoover wrote:
> 742753 - total number of words in it
> 515654 - total number of words which have been seen only once
> 80485 - ... twice
> 35325 - ... 3 times
>
> These statistics show that most of the db is not used, just eating my hard drive (44 MB total size)
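To put the quoted counts in proportion, the sketch below turns them into shares of the whole database. The numbers are taken from the message. The helper name `share` is hypothetical, and whether rarely seen tokens are really "not used" is the poster's reading, not something the arithmetic proves.

```python
TOTAL_TOKENS = 742_753
SEEN_COUNTS = {1: 515_654, 2: 80_485, 3: 35_325}  # times seen -> tokens


def share(count, total=TOTAL_TOKENS):
    """Fraction of all tokens represented by `count`."""
    return count / total


shares = {n: share(c) for n, c in SEEN_COUNTS.items()}
seen_at_most_3 = share(sum(SEEN_COUNTS.values()))

for n, frac in shares.items():
    print(f"seen {n} time(s): {frac:.1%}")
print(f"seen 3 times or fewer: {seen_at_most_3:.1%}")
```

Tokens seen only once make up roughly 69% of the database, and tokens seen three times or fewer make up roughly 85%.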

[SAtalk] Re: Bayes database stats

2003-12-13 Thread Bryan Hoover
Alexander Litvinov wrote:
>
> Today I have dumped my bayes db and calculated some statistics.
>
> 742753 - total number of words in it
> 515654 - total number of words which have been seen only once
> 80485 - ... twice
> 35325 - ... 3 times
>
> These statistics show that most of the db is n