Hello Simon,

Sunday, December 28, 2003, 7:51:15 PM, you wrote:

>> No.  You *do* need a minimum of 200 hams.  The reason behind this is
>> that for Bayes to work, it needs to know *both* what spam looks like
>> *and* what ham looks like so it can tell the difference.
>> 
>> But yes, it is best to have a vaguely equal number of both ham and
>> spam (but you can easily have more than twice as many ham as spam or
>> vice-versa and still have Bayes work well).

SMCB> [Simon] I get about 50 spams a day and maybe 10 regular emails of which
SMCB> 4 are under the -1 threshold for ham. It's going to be somewhat
SMCB> difficult to get even close to parity for my spam/ham count.

Don't worry too much about the ratio.  If you get 20 spam to 1 ham, learn
all of them. The important thing is to learn enough mail that Bayes has
the information it needs to make an intelligent decision, and the MOST
important thing is to learn it correctly (do not learn ham as spam or
vice versa).

There are periods when I easily run 20 to 1 or 30 to 1 spam to ham across
the domains I manage. I learn all of the email. Bayes is my friend.
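
A rough sketch of why a lopsided ratio is tolerable, assuming a simplified
per-token estimate that compares how often a token shows up in each class.
This is not SpamAssassin's actual code; the function name and the counts
are made up for illustration.

```python
def token_spam_prob(spam_hits, ham_hits, n_spam, n_ham):
    """Simplified per-token spam estimate (illustrative assumption only).

    spam_hits / ham_hits: learned messages of each class containing the token.
    n_spam / n_ham: total learned messages of each class.
    """
    if n_spam <= 0 or n_ham <= 0:
        raise ValueError("need at least one learned message of each class")
    # Normalize by class size, so a larger spam corpus does not by itself
    # push the estimate toward spam.
    spam_freq = spam_hits / n_spam
    ham_freq = ham_hits / n_ham
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_freq / (spam_freq + ham_freq)


# Same per-class token frequencies, very different corpus ratios.
balanced = token_spam_prob(spam_hits=100, ham_hits=10, n_spam=200, n_ham=200)
skewed = token_spam_prob(spam_hits=2000, ham_hits=10, n_spam=4000, n_ham=200)
print(balanced, skewed)
```

Under that assumption, the 1-to-1 and the 20-to-1 corpus give the same
estimate for the token, because each count is divided by the size of its
own class. What hurts is too few messages in a class, or messages learned
into the wrong class.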

Bob Menschel



