Hi,
Loren Wilton wrote:
You are into the land of opinions here, so you will get different answers.
<SNIP>
Once you have the basic stuff, I personally prefer to leave auto-learning
turned off and only hand Bayes hams and spams that might be
misclassified, or ones where the Bayes score isn't high enough in the
appropriate direction. Others may want to do things differently.
Personally I'd say that you REALLY should turn off auto-learning at the
start, until you have got Bayes a good start in life by hand. Once you
have it working and you are happy with it you may want to turn
auto-learning back on, or may not. If you do turn it back on, you
probably want to set bayes-ham-threshold (or whatever the name really
is) to around -.1 rather than the default value.
I entirely agree about turning auto-learning off until you are happy
that Bayes is working pretty well for you. If you do turn on
auto-learning it is vital that you adjust the thresholds. These are my
values:
bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 12.0
This has worked really well for me, with a site-wide Bayes database,
which I train manually on mistakes only. I also occasionally learn a
handful of hams to keep it up to date.
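
In case it helps, here is roughly what those two thresholds mean, written
as a small Python sketch. It is only an illustration and not
SpamAssassin's actual code: the function name auto_learn_decision is made
up, the handling of a score that lands exactly on a threshold is my
assumption, and the real auto-learn check applies further conditions that
are left out here.

```python
# Simplified sketch of the auto-learn decision, NOT SpamAssassin's code.
# The two constants mirror the config values quoted above.
HAM_THRESHOLD = -0.1   # bayes_auto_learn_threshold_nonspam
SPAM_THRESHOLD = 12.0  # bayes_auto_learn_threshold_spam


def auto_learn_decision(score, ham_threshold=HAM_THRESHOLD,
                        spam_threshold=SPAM_THRESHOLD):
    """Return 'ham', 'spam', or None if the message is not auto-learned.

    Hypothetical helper: boundary handling (< and >=) is an assumption.
    """
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    # Anything in between is left for manual training.
    return None


if __name__ == "__main__":
    for s in (-1.5, 0.0, 5.0, 15.0):
        print(s, auto_learn_decision(s))
```

The point of the wide gap between -0.1 and 12.0 is that only messages
Bayes can hardly get wrong are learned automatically; everything in the
middle stays out of the database unless I feed it in by hand.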
Loren
----- Original Message -----
*From:* Leigh Sharpe <mailto:[EMAIL PROTECTED]>
*To:* users <mailto:users@spamassassin.apache.org>
*Sent:* Thursday, June 29, 2006 4:45 PM
*Subject:* Training Bayes properly
So it looks like I have to reset my Bayes and re-train it. I want to
do it properly this time. I will be making sure I personally review
every message that our users put into the spam folder first, to make
sure they haven't put spam into the wrong folder. However, I have a
couple of questions:
1) Am I better off to feed it a few emails a day, or wait until I
get a few hundred, then feed them all to sa-learn at once? Is there
really a difference?
2) How many spams should I feed it? I've heard in some places that
200 is OK, I've heard elsewhere that 10000 or more are needed.
3) Just how 'balanced' should its diet be? Should I use the same
quantity of ham as spam, or can I get away with less ham than spam?
Regards,
Leigh
Leigh Sharpe
Network Systems Engineer
Pacific Wireless
Ph +61 3 9584 8966
Mob 0408 009 502
email [EMAIL PROTECTED]
web www.pacificwireless.com.au
--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW: http://www.chime.ucl.ac.uk/~rmhiajp/
"If you have an apple and I have an apple and we exchange apples
then you and I will still each have one apple. But if you have an
idea and I have an idea and we exchange these ideas, then each of us
will have two ideas." -- George Bernard Shaw