I would guess this is normal.  Tokens like Message-IDs are unique to each
message and so are seen only once, whereas common words like "the" appear
very many times.
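
To put rough numbers on it, here is a quick sketch in Python using only the
counts from the quoted message below. It assumes "total number of words"
means the number of distinct tokens in the db; the variable and function
names are made up for illustration and are not part of SpamAssassin.

```python
# Counts copied from the quoted message; assumed to be distinct tokens.
total_tokens = 742753
seen_n_times = {1: 515654, 2: 80485, 3: 35325}


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Share of the db taken by tokens seen exactly once, twice, three times.
for n, count in seen_n_times.items():
    print(f"seen {n}x: {count:7d} tokens ({share(count, total_tokens):.1f}%)")

# Tokens seen at most three times, taken together.
rare = sum(seen_n_times.values())
print(f"seen 1-3x: {rare:7d} tokens ({share(rare, total_tokens):.1f}%)")
```

Roughly two thirds of the tokens were seen only once, which is what you
would expect if a large part of the db is one-off strings.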

> -----Original Message-----
> From: Alexander Litvinov [mailto:[EMAIL PROTECTED]
> Sent: Saturday, December 13, 2003 10:10 AM
> To: [EMAIL PROTECTED]
> Subject: [SAtalk] Bayes database stats
> 
> 
> Today I dumped my bayes db and calculated some statistics.
> 
>  742753 - total number of words in it
>  515654 - total number of words which have been seen only once
>   80485 - ... twice
>   35325 - ... 3 times
> 
> These statistics show that most of the db is not used and is just
> eating my hard drive (44 MB total size). Is this normal?

