Re: Bayes advanced questions

Theo Van Dinter Thu, 11 May 2006 11:07:14 -0700

On Thu, May 11, 2006 at 06:17:14PM +0200, Michael Monnerie wrote:
> > > bayes: synced databases from journal in 11 seconds: 1968 unique
> > > entries (3059 total entries)
> > That's the journal sync, not the expiry part. The expiry part takes
> > much longer.
> 
> It comes from "sa-learn --force-expire --sync". How could I see when it
> expires something? Could it be that nothing expires because ntokens has
> not yet reached 2 million?


Yes.  The expiry logic is well documented in the sa-learn POD, but the basics
are:

You set the max db size to 2000000, which means that SA tries to expire down
to 2000000*0.75 = 1500000 tokens.  According to your post, you only have
1261864 tokens which is less than 1500000, so there's nothing to do for an
expiry.  You'd need a minimum of 1501000 tokens in the DB for an expire to
actually run.
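
The arithmetic above can be sketched in a few lines of Python.  This only
mirrors the numbers in this message and is not SpamAssassin's actual code:
the function names are made up for illustration, and the 1000-token minimum
is an assumption inferred from the 1501000 figure above.

```python
# Illustrative only: mirrors the arithmetic in this message, not SA's source.
MIN_TOKENS_TO_EXPIRE = 1000  # assumption, inferred from 1501000 - 1500000


def expiry_target(max_db_size):
    """Token count SA tries to expire down to (75% of the max db size)."""
    return max_db_size * 3 // 4


def would_expire(ntokens, max_db_size):
    """True if enough tokens sit above the target for an expire to run."""
    return ntokens - expiry_target(max_db_size) >= MIN_TOKENS_TO_EXPIRE


if __name__ == "__main__":
    print(expiry_target(2000000))
    print(would_expire(1261864, 2000000))
    print(would_expire(1501000, 2000000))
```

With the values from this thread, the current DB is below the target, so
a forced expire has nothing to remove.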

> On http://wiki.apache.org/spamassassin/BayesForceExpire it says you 
> should stop SA before --force-expire, is that a must or a 
> recommendation? The man page doesn't ask for it.

It's completely unnecessary to stop SA (that'd be a horrible requirement,
wouldn't it?).

> > >> score used is the score the message would have got if:
> > >>  bayes was disabled
> > >>  the AWL was disabled
> > >>  no userconf (ie:black/whitelists) rules were enabled.
> > >
> > > That's good info which should be in the man page.
> >
> > It is.. In SA 3.1.x it's in the docs for the autolearn threshold
> > plugin:
> >
> > http://spamassassin.apache.org/full/3.1.x/dist/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html
> 
> Not really. No mention that bayes/awl/userconf are not counted.

Really?  Did you look at the plugin POD?

       Note that certain tests are ignored when determining whether a message
       should be trained upon:

       * rules with tflags set to ’learn’ (the Bayesian rules)
       * rules with tflags set to ’userconf’ (user configuration)
       * rules with tflags set to ’noautolearn’
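
To make the list above concrete, here is a rough Python sketch of the
exclusion step only.  It is not the plugin's code: the function name, rule
names, scores and the tuple layout are all hypothetical, and the real plugin
applies further checks that are not shown here.

```python
# Illustrative only: rule names, scores and tflags below are hypothetical.
IGNORED_TFLAGS = {"learn", "userconf", "noautolearn"}


def autolearn_score(hits):
    """Sum scores of hit rules, skipping any rule carrying an ignored tflag.

    hits: iterable of (rule_name, score, tflags) tuples.
    """
    return sum(score for _name, score, tflags in hits
               if not (set(tflags) & IGNORED_TFLAGS))


hits = [
    ("EXAMPLE_BODY_RULE", 2.5, set()),
    ("EXAMPLE_BAYES_RULE", 3.5, {"learn"}),       # ignored: Bayesian rule
    ("EXAMPLE_USER_RULE", -100.0, {"userconf"}),  # ignored: user config
    ("EXAMPLE_NET_RULE", 1.0, {"net"}),           # counted: tflag not ignored
]
print(autolearn_score(hits))
```

Only the first and last example rules contribute, so the Bayes and userconf
hits do not influence whether the message is trained upon.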

> For example, the man page says:
> * Also note that auto-learning occurs using scores from either scoreset
> * 0 or 1
> 
> But who except the devs knows what's scoreset 0 or 1?

Anyone who's read the documentation for "score" ? ;)
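
For reference, the "score" documentation describes four score sets, chosen
by whether Bayes and network tests are in use.  A small Python sketch of
that mapping follows; the function names are made up for illustration and
nothing here is SpamAssassin code.

```python
def scoreset(bayes_enabled, net_enabled):
    """Index (0-3) of the score column used for a given configuration.

    0: no Bayes, no net tests    1: no Bayes, net tests
    2: Bayes, no net tests       3: Bayes and net tests
    """
    return (2 if bayes_enabled else 0) + (1 if net_enabled else 0)


def autolearn_scoreset(net_enabled):
    """Auto-learning uses the non-Bayes set: 0 without net tests, 1 with."""
    return scoreset(False, net_enabled)


if __name__ == "__main__":
    print(scoreset(True, True), autolearn_scoreset(True))
```

So "scoreset 0 or 1" simply means the scores a rule would have with Bayes
turned off, without or with network tests.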

-- 
Randomly Generated Tagline:
Fry: "They're great! They're like sex except I'm having them."
