On Tue, 31 Mar 2009, John Hardin wrote:
On Tue, 31 Mar 2009, Lucio Chiappetti wrote:

 users MAY forward spam ... to ... a daily crontab ... for sa-learn ...

Do you retain those messages? If not, you have no way to review how SA has been manually trained.

NO. We did an initial training a couple of years ago, with some 2000 spam (collected during a few weeks before installation) and some 2000 ham (collected from the mail archives of a few users). Then we have this in
/etc/mail/spamassassin/local.cf

bayes_auto_learn                1
bayes_learn_to_journal          1
bayes_learn_during_report       1

We do keep the quarantined spam for 7 days (and among it we have essentially no false positives).

The false negatives, by contrast, are forwarded voluntarily by a few users to the daily cron job, and are deleted once they have been learned.

 0.000          0      31125          0  non-token data: nspam
 0.000          0     239162          0  non-token data: nham

Your bayes is trained with a strong bias towards ham. It should be more the other way, since the raw volume of email is biased towards spam.

I suggest you also consider either disabling autolearn or pushing the learn-as-ham threshold lower.

I would be glad to do the latter, if I knew where to find that threshold.
There is nothing like it in /etc/mail/spamassassin/local.cf, nor can I find any documentation of the configuration parameters on the wiki site.

Would that be one of those two in /usr/share/spamassassin/10_misc.cf ?

bayes_auto_learn_threshold_nonspam      0.1
bayes_auto_learn_threshold_spam         12.0
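
If those are indeed the relevant settings, I assume the place to override them would be local.cf rather than the distributed file under /usr/share, which an upgrade would overwrite. A minimal sketch of what I have in mind; the value -1.0 is only an illustrative guess of mine, not something recommended in this thread:

```
# /etc/mail/spamassassin/local.cf
# Hypothetical override: auto-learn a message as ham only if it scores
# below -1.0 (the distributed default in 10_misc.cf is 0.1).
bayes_auto_learn_threshold_nonspam      -1.0

# Learn-as-spam threshold left at the distributed default.
bayes_auto_learn_threshold_spam         12.0
```

Running "spamassassin --lint" after such a change should report any syntax problem before the new settings are used.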


--
Lucio Chiappetti - INAF/IASF - via Bassini 15 - I-20133 Milano (Italy)
For more info : http://www.iasf-milano.inaf.it/~lucio/personal.html
-----------------------------------------------------------------------
"Nature" on government cuts to research       http://snipurl.com/4erid
"Nature" e i tagli del governo alla ricerca   http://snipurl.com/4erko
