Hi,

Loren Wilton wrote:
You are into the land of opinions here, so you will get different answers.

<SNIP>

Once you have the basic stuff I personally prefer to leave auto-learning turned off and only hand Bayes hams and spams that might be misclassified, or ones where the Bayes score isn't high enough in the appropriate direction. Others may want to do things differently. Personally I'd say that you REALLY should turn off auto-learning at the start, until you have given Bayes a good start in life by hand. Once you have it working and you are happy with it you may want to turn auto-learning back on, or may not. If you do turn it back on, you probably want to set bayes-ham-threshold (or whatever the name really is) to around -0.1 rather than the default value.

I entirely agree about turning auto-learning off until you are happy that Bayes is working pretty well for you. If you do turn on auto-learning, it is vital that you adjust the thresholds. These are my values:

bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 12.0


This has worked really well for me, with a site-wide Bayes database, which I train manually on misclassified messages. I also occasionally learn a handful of hams to keep the database up to date.
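
For reference, here is a rough sketch of how the staged approach discussed above might look in a SpamAssassin configuration file. The file name local.cf and its location are assumptions (they vary by install), and the two-stage layout is only one way to arrange it; the option names are standard SpamAssassin settings, and the threshold values are the ones quoted above.

```
# local.cf (assumed file name; the path varies by installation)

# Stage 1: train by hand only, with auto-learning off.
use_bayes 1
bayes_auto_learn 0

# Stage 2 (optional, once you are happy with Bayes):
# comment out the bayes_auto_learn line above and enable these instead.
# bayes_auto_learn 1
# bayes_auto_learn_threshold_nonspam -0.1
# bayes_auto_learn_threshold_spam 12.0
```

With auto-learning off, the database only changes when messages are fed to sa-learn by hand, so a misfiled message cannot quietly skew it.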

Loren

    ----- Original Message -----
    *From:* Leigh Sharpe <mailto:[EMAIL PROTECTED]>
    *To:* users <mailto:users@spamassassin.apache.org>
    *Sent:* Thursday, June 29, 2006 4:45 PM
    *Subject:* Training Bayes properly

    So it looks like I have to reset my Bayes and re-train it. I want to
    do it properly this time. I will be making sure I personally review
    every message that our users put into the spam folder first, to make
    sure they haven't put spam into the wrong folder. However, I have a
    couple of questions:
    1) Am I better off to feed it a few emails a day, or wait until I
    get a few hundred, then feed them all to sa-learn at once? Is there
    really a difference?
    2) How many spams should I feed it? I've heard in some places that
    200 is OK, I've heard elsewhere that 10000 or more are needed.
    3) Just how 'balanced' should its diet be? Should I use the same
    quantity of ham as spam, or can I get away with less ham than spam?
    Regards,
                 Leigh
    Leigh Sharpe
    Network Systems Engineer
    Pacific Wireless
    Ph +61 3 9584 8966
    Mob 0408 009 502
    email [EMAIL PROTECTED]
    web www.pacificwireless.com.au


--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW:    http://www.chime.ucl.ac.uk/~rmhiajp/
"If you have an apple and I have  an apple and we  exchange apples
then you and I will still each have  one apple. But  if you have an
idea and I have an idea and we exchange these ideas, then each of us
will have two ideas." -- George Bernard Shaw
