Re: help with training bayesian filter

Loren Wilton Wed, 17 Oct 2007 23:08:26 -0700

I think the first things I'd do would be to make some adjustments to the settings:

bayes_auto_learn_threshold_nonspam  0.2
bayes_min_ham_num 200


And probably leave the rest the same.

Then I'd train on 200 hams, which you can go back into history to get; your ham messages probably don't change much year to year.

Also train at least 200 spams, which should be easy. In this case, though, you want recent junk, not something from 6 months ago. If it takes 6 days until you have enough spam to trigger Bayes, that's fine, just wait for it.
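
One way to check whether those counts have been reached is to look at the output of sa-learn --dump magic, which reports the number of learned hams (nham) and spams (nspam). What follows is only a minimal Python sketch: the sample text is made up, the column layout is assumed from typical output rather than guaranteed, and the function names are hypothetical, not part of SpamAssassin.

```python
import re

# Illustrative sample of `sa-learn --dump magic` output; the numbers are invented.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        212          0  non-token data: nspam
0.000          0        187          0  non-token data: nham
0.000          0      41230          0  non-token data: ntokens
"""


def learned_counts(dump_text):
    """Return (nham, nspam) parsed from the dump text; a missing counter counts as 0."""
    counts = {"nham": 0, "nspam": 0}
    for line in dump_text.splitlines():
        match = re.search(r"non-token data: (nham|nspam)\s*$", line)
        if match:
            # Assumption: the counter value is the third whitespace-separated column.
            counts[match.group(1)] = int(line.split()[2])
    return counts["nham"], counts["nspam"]


def bayes_ready(dump_text, min_ham=200, min_spam=200):
    """True once both counts reach the configured minimums (bayes_min_ham_num, bayes_min_spam_num)."""
    nham, nspam = learned_counts(dump_text)
    return nham >= min_ham and nspam >= min_spam
```

With the sample above, bayes_ready(SAMPLE_DUMP) is false because only 187 hams have been learned. To try it on a real database, you could pass in the text captured from running sa-learn --dump magic as the user that owns the Bayes database.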

At that point Bayes should kick in. Now you get to the hard part. You need to watch Bayes like a hawk for a few weeks to make sure you really got it trained right! If you do this, and feed it corrections when you don't like how it scored a ham or spam, you will be fine. If you *don't* do this, you will probably end up with Bayes going off on a tangent, and you may end up with a database that is so badly trashed you will have to throw it away and start over.
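
Feeding a correction usually means handing the misjudged message back to sa-learn with --spam or --ham. Here is a rough Python sketch of that step. The helper names and the dry_run switch are hypothetical, and it assumes sa-learn is on the PATH and is run as the same user whose Bayes database spamd uses.

```python
import subprocess


def correction_command(message_path, is_spam):
    """Build the sa-learn argument list for one misclassified message."""
    flag = "--spam" if is_spam else "--ham"
    return ["sa-learn", flag, message_path]


def feed_correction(message_path, is_spam, dry_run=True):
    """Return the command; actually run it only when dry_run is False."""
    cmd = correction_command(message_path, is_spam)
    if not dry_run:
        # check=True raises CalledProcessError if sa-learn exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd
```

For example, feed_correction("missed-spam.eml", is_spam=True) returns the command that would relearn a spam that slipped through, without running it. The file name is just a placeholder.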

But this watching closely business and feeding in corrections to get things right should only take a few weeks at most, unless the kind of mail you get changes. I've had bayes running for years on the same database, and quite honestly I haven't had to train a message in probably a year now. I also don't run auto-learning, and it is still giving me bayes_99 on my spams and numbers around 0 to 10 on my hams. I guess that means my message types don't change much. ;-)


       Loren

----- Original Message -----
From: "sinnerman" <[EMAIL PROTECTED]>
To: <users@spamassassin.apache.org>
Sent: Wednesday, October 17, 2007 8:49 PM
Subject: Re: help with training bayesian filter


I'm running spamd as:

spamd -d -l -u nobody --siteconfigpath=<my site config's path>

My config file is:

required_hits   4
bayes_auto_learn_threshold_nonspam  1
bayes_auto_learn_threshold_spam     8
bayes_min_ham_num 100
score BAYES_99 5

I don't have bayes_auto_learn set explicitly, but the docs indicate that
enabled is the default setting.


Mr. Gus wrote:


I have a systemwide config so I don't know from experience, but are you
running spamd with -x or setting the user with -u? Because if you are,
that might be mucking you up.

Do you have bayes_auto_learn set? That's what turns it on/off.

--
Gus

--

View this message in context: http://www.nabble.com/help-with-training-bayesian-filter-tf4643977.html#a13267625
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
