> -----Original Message-----
> From: S. M. C. Butler
> Sent: Sunday, December 28, 2003 7:51 PM
> 
[...] 
> [Simon] I get about 50 spams a day and maybe 10 regular emails of which
> 4 are under the -1 threshold for ham. It's going to be somewhat
> difficult to get even close to parity for my spam/ham count.
> 

That's not atypical. For example,

% sa-learn --dump magic
0.000          0          2          0  non-token data: bayes db version
0.000          0      47980          0  non-token data: nspam
0.000          0      16783          0  non-token data: nham
[...]

There are about 3x as many spam messages as ham in my Bayes database, yet
the Bayes scoring system is working well. I did initially train it (months
ago) by feeding it an equal number (5000 or so of each) of ham and spam
messages that I'd sorted through by hand. This helps a lot, because you
don't have to depend on auto-learn, which only triggers at certain score
levels. It is, after all, the mail that is close to the cut-off that needs
to be discriminated most accurately, and only the user can make that
determination.
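
The "about 3x" figure can be checked directly from the nspam and nham lines
of the dump. The following is a minimal Python sketch, not part of
SpamAssassin: the helper name parse_magic is hypothetical, and it assumes
the value of interest is the third numeric column, as in the sample output
quoted above.

```python
def parse_magic(dump_text):
    """Collect the 'non-token data' counters from `sa-learn --dump magic` output.

    Assumption: the counter value sits in the third numeric column,
    which matches the sample lines quoted in this message.
    """
    counts = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue  # skip "[...]" and any other lines
        fields, name = line.split("non-token data:", 1)
        cols = fields.split()
        if len(cols) < 3:
            continue  # not a well-formed data line
        counts[name.strip()] = int(cols[2])
    return counts


# The three lines shown in the dump above.
sample = """\
0.000          0          2          0  non-token data: bayes db version
0.000          0      47980          0  non-token data: nspam
0.000          0      16783          0  non-token data: nham
"""

counts = parse_magic(sample)
ratio = counts["nspam"] / counts["nham"]
print(f"spam:ham = {ratio:.2f}:1")  # prints "spam:ham = 2.86:1"
```

In practice the text would come from running the command and capturing its
output instead of using the hard-coded sample.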



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
