Re: quirks with bayes ?

Lucio Chiappetti Tue, 31 Mar 2009 08:39:56 -0700

On Tue, 31 Mar 2009, John Hardin wrote:

On Tue, 31 Mar 2009, Lucio Chiappetti wrote:

 users MAY forward spam ... to ... a daily crontab ... for sa-learn ...
Do you retain those messages? If not, you have no way to review how SA has been manually trained.

NO. We did an initial training a couple of years ago, with some 2000 spam (collected during a few weeks before installation) and some 2000 ham (collected from the mail archives of a few users). Then we have this in

/etc/mail/spamassassin/local.cf

bayes_auto_learn                1
bayes_learn_to_journal          1
bayes_learn_during_report       1

We do keep the quarantined spam for 7 days (and there we have essentially no false positives).

The false negatives, instead, are passed voluntarily by a few users to the daily crontab, but are deleted afterwards.

 0.000          0      31125          0  non-token data: nspam
 0.000          0     239162          0  non-token data: nham

Your bayes is trained with a strong bias towards ham. It should be more the other way, since the raw volume of email is biased towards spam.

I suggest you also consider either disabling autolearn, or pushing the learn-as-ham threshold lower.


I would be glad to do the latter, if I knew where to find such a threshold.

There is nothing like that in /etc/mail/spamassassin/local.cf, nor can I find any documentation of the configuration parameters on the wiki site.


Would it be one of these two in /usr/share/spamassassin/10_misc.cf ?

bayes_auto_learn_threshold_nonspam      0.1
bayes_auto_learn_threshold_spam         12.0
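
If these are the relevant settings, a minimal sketch of the override might look like the fragment below, placed in /etc/mail/spamassassin/local.cf so that the shipped 10_misc.cf stays untouched. The value -1.0 is purely an illustrative assumption standing in for "lower than 0.1", not a tested recommendation, and the commented-out last line is the other option mentioned above (disabling autolearn).

```
# /etc/mail/spamassassin/local.cf
# Assumption: -1.0 is an illustrative value, not a tuned one.
# Messages scoring below this are auto-learned as ham (shipped default: 0.1).
bayes_auto_learn_threshold_nonspam      -1.0
# Messages scoring above this are auto-learned as spam (shipped default: 12.0).
bayes_auto_learn_threshold_spam         12.0

# Alternative: disable autolearn entirely instead of tuning the thresholds.
# bayes_auto_learn                      0
```

Running spamassassin --lint after the change should catch any typo in the file.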


--
Lucio Chiappetti - INAF/IASF - via Bassini 15 - I-20133 Milano (Italy)
For more info : http://www.iasf-milano.inaf.it/~lucio/personal.html
-----------------------------------------------------------------------
"Nature" on government cuts to research       http://snipurl.com/4erid
"Nature" e i tagli del governo alla ricerca   http://snipurl.com/4erko
