The first thing I'd do is adjust a couple of
settings:
bayes_auto_learn_threshold_nonspam 0.2
bayes_min_ham_num 200
And probably leave the rest the same.
Then I'd train on 200 hams, which you can go back into history to get; your
ham messages probably don't change much year to year.
Also train at least 200 spams, which should be easy. In this case though
you want recent junk, not something from 6 months ago. If it takes 6 days
until you have enough spam to trigger bayes, that's fine, just wait for it.
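
Roughly, the training step could look like the sketch below. The function name and the directories you pass in are placeholders I made up, so substitute wherever your mail actually lives. Also, I'm assuming you run this as the same user spamd scans as (you mentioned -u nobody); otherwise sa-learn will most likely end up feeding a different Bayes database than the one spamd reads.

```shell
# Sketch only: train_initial and its arguments are illustrative placeholders.
# Set SA_LEARN=echo to see what would be run without touching the database.
SA_LEARN="${SA_LEARN:-sa-learn}"

train_initial() {
    ham_dir="$1"     # a folder of known-good mail; old messages are fine
    spam_dir="$2"    # a folder of RECENT spam only
    "$SA_LEARN" --ham "$ham_dir" &&
    "$SA_LEARN" --spam "$spam_dir" &&
    "$SA_LEARN" --dump magic    # prints the database counters, incl. nham/nspam
}
```

Call it with something like train_initial /path/to/ham /path/to/recent-spam. Once the nham and nspam counters in the dump are both at 200 or more, you have met the minimums.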
At that point Bayes should kick in. Now you get to the hard part. You need
to watch Bayes like a hawk for a few weeks to make sure you really got it
trained right! If you do this, and feed it corrections when you don't like
how it scored a ham or spam, you will be fine. If you *don't* do this, you
will probably end up with Bayes going off on a tangent, and you may end up
with a database that is so badly trashed you will have to throw it away and
start over.
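
Feeding a correction is just another sa-learn run on the message Bayes got wrong. A minimal sketch, assuming you have saved the offending message to a file; the two helper names are mine, not anything SpamAssassin ships:

```shell
# Sketch only: correct_as_spam / correct_as_ham are illustrative wrappers.
# Set SA_LEARN=echo to dry-run.
SA_LEARN="${SA_LEARN:-sa-learn}"

# A spam that Bayes scored as hammy: teach it as spam.
correct_as_spam() { "$SA_LEARN" --spam "$@"; }

# A ham that Bayes scored as spammy: teach it as ham.
correct_as_ham() { "$SA_LEARN" --ham "$@"; }
```

If the message had already been learned the wrong way round, sa-learn forgets the old classification before learning the new one, so there is no separate undo step.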
But this watching closely business and feeding in corrections to get things
right should only take a few weeks at most, unless the kind of mail you get
changes. I've had bayes running for years on the same database, and quite
honestly I haven't had to train a message in probably a year now. I also
don't run auto-learning, and it is still giving me bayes_99 on my spams and
numbers around 0 to 10 on my hams. I guess that means my message types
don't change much. ;-)
Loren
----- Original Message -----
From: "sinnerman" <[EMAIL PROTECTED]>
To: <users@spamassassin.apache.org>
Sent: Wednesday, October 17, 2007 8:49 PM
Subject: Re: help with training bayesian filter
I'm running spamd as:
spamd -d -l -u nobody --siteconfigpath=<my site config's path>
My config file is:
required_hits 4
bayes_auto_learn_threshold_nonspam 1
bayes_auto_learn_threshold_spam 8
bayes_min_ham_num 100
score BAYES_99 5
I don't have bayes_auto_learn set explicitly, but the docs indicate that
enabled is the default setting.
Mr. Gus wrote:
I have a systemwide config so I don't know from experience, but are you
running spamd with -x or setting the user with -u? Because if you are,
that might be mucking you up.
Do you have bayes_auto_learn set? That's what turns it on/off.
--
Gus