The first thing I'd do is adjust a couple of settings:

bayes_auto_learn_threshold_nonspam  0.2
bayes_min_ham_num 200

And probably leave the rest the same.
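
To show why the tighter nonspam threshold matters, here is a rough Python sketch of the threshold check. It is a simplification and not SpamAssassin's actual code: the function name is made up, and the real auto-learner applies further conditions that are left out here. The 8 is the bayes_auto_learn_threshold_spam value from the config quoted below.

```python
# Simplified sketch of the auto-learn threshold check; NOT SpamAssassin's code.
# Assumption: the real logic applies extra conditions that are omitted here.

def autolearn_decision(score, threshold_nonspam=0.2, threshold_spam=8.0):
    """Return 'ham', 'spam', or None (not learned) for a message score."""
    if score < threshold_nonspam:
        return "ham"
    if score >= threshold_spam:
        return "spam"
    return None

# A message scoring 0.5 is auto-learned as ham with the old threshold of 1,
# but is left alone with the tighter threshold of 0.2.
print(autolearn_decision(0.5, threshold_nonspam=1.0))  # ham
print(autolearn_decision(0.5, threshold_nonspam=0.2))  # None
```

The point is that fewer borderline messages get learned as ham automatically, so a low-scoring spam is less likely to poison the ham side.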

Then I'd train on 200 hams, which you can pull from your mail history; your ham messages probably don't change much from year to year.

Also train at least 200 spams, which should be easy. In this case, though, you want recent junk, not something from 6 months ago. If it takes 6 days until you have enough spam to trigger Bayes, that's fine, just wait for it.
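
If you want to check how far along the training is, `sa-learn --dump magic` prints the nspam and nham counters. Below is a small Python sketch that reads them. The helper names are hypothetical, and the line layout in the comment is what I'd expect the dump to look like, so treat it as an assumption and compare it against your own output. Run sa-learn as the same user spamd scans as, or you will be looking at a different database.

```python
import re
import subprocess

def parse_counts(dump_text):
    """Extract the nspam and nham counters from `sa-learn --dump magic` output."""
    counts = {}
    for line in dump_text.splitlines():
        # Assumed line shape: "0.000   0   <count>   0  non-token data: nham"
        m = re.match(
            r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (nspam|nham)\s*$",
            line,
        )
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts

def bayes_ready(counts, min_ham=200, min_spam=200):
    """True once both counters reach the configured minimums."""
    return counts.get("nham", 0) >= min_ham and counts.get("nspam", 0) >= min_spam

def current_counts():
    """Run sa-learn as the current user; needs SpamAssassin installed."""
    result = subprocess.run(
        ["sa-learn", "--dump", "magic"],
        capture_output=True, text=True, check=True,
    )
    return parse_counts(result.stdout)

# Made-up sample output, only to show the parsing.
sample = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        212          0  non-token data: nspam
0.000          0        187          0  non-token data: nham
"""
counts = parse_counts(sample)
print(counts)               # {'nspam': 212, 'nham': 187}
print(bayes_ready(counts))  # False: still short on ham
```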

At that point Bayes should kick in. Now comes the hard part: you need to watch Bayes like a hawk for a few weeks to make sure you really got it trained right. If you do this, and feed it corrections whenever you don't like how it scored a ham or spam, you will be fine. If you *don't*, you will probably end up with Bayes going off on an odd tangent, and you may end up with a database so badly trashed that you have to throw it away and start over.
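
One way to do the watching is to look at which BAYES_nn rule each message hit and relearn the ones that landed on the wrong side. Here is a rough Python sketch of that idea. The function names, the cutoff of 50, and the sample header text are all my own assumptions for illustration; only `sa-learn --spam` and `sa-learn --ham` are the real commands.

```python
import re

# Bayes rules are named BAYES_00 ... BAYES_99 and show up in the tests= list
# of the X-Spam-Status header.
BAYES_RULE = re.compile(r"\bBAYES_(\d{2,3})\b")

def bayes_bucket(spam_status):
    """Return the number from the BAYES_nn rule in a header value, or None."""
    m = BAYES_RULE.search(spam_status)
    return int(m.group(1)) if m else None

def correction(spam_status, really_spam, path):
    """Suggest an sa-learn command if Bayes leaned the wrong way, else None."""
    bucket = bayes_bucket(spam_status)
    if bucket is None:
        return None  # Bayes did not fire on this message
    if really_spam and bucket <= 50:      # hypothetical cutoff
        return ["sa-learn", "--spam", path]
    if not really_spam and bucket >= 50:  # hypothetical cutoff
        return ["sa-learn", "--ham", path]
    return None

# Made-up header value for a spam that Bayes thought was clean.
status = "No, score=2.1 required=4.0 tests=BAYES_00,HTML_MESSAGE autolearn=no"
print(correction(status, really_spam=True, path="missed-spam.eml"))
# ['sa-learn', '--spam', 'missed-spam.eml']
```

The sketch only builds the command; run it yourself (or pass it to subprocess) once you have looked at the message and agree it really was misjudged.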

But this business of watching closely and feeding in corrections should only take a few weeks at most, unless the kind of mail you get changes. I've had Bayes running for years on the same database, and quite honestly I haven't had to train a message in probably a year now. I also don't run auto-learning, and it is still giving me BAYES_99 on my spams and numbers around 0 to 10 on my hams. I guess that means my message types don't change much. ;-)

       Loren


----- Original Message -----
From: "sinnerman" <[EMAIL PROTECTED]>
To: <users@spamassassin.apache.org>
Sent: Wednesday, October 17, 2007 8:49 PM
Subject: Re: help with training bayesian filter



I'm running spamd as:

spamd -d -l -u nobody --siteconfigpath=<my site config's path>

My config file is:

required_hits   4
bayes_auto_learn_threshold_nonspam  1
bayes_auto_learn_threshold_spam     8
bayes_min_ham_num 100
score BAYES_99 5

I don't have bayes_auto_learn set explicitly, but the docs indicate that
enabled is the default setting.


Mr. Gus wrote:

I have a systemwide config so I don't know from experience, but are you
running spamd with -x or setting the user with -u? Because if you are,
that might be mucking you up.

Do you have bayes_auto_learn set? That's what turns it on/off.

--
Gus


