On Tue, 31 Mar 2009, John Hardin wrote:
On Tue, 31 Mar 2009, Lucio Chiappetti wrote:
users MAY forward spam ... to ... a daily crontab ... for sa-learn ...
Do you retain those messages? If not, you have no way to review how SA has
been manually trained.
NO. We did an initial training a couple of years ago, with some 2000 spam
(collected during a few weeks before installation) and some 2000 ham
(collected from the mail archives of a few users). Then we have this in
/etc/mail/spamassassin/local.cf
bayes_auto_learn 1
bayes_learn_to_journal 1
bayes_learn_during_report 1
We do keep the quarantined spam for 7 days (and among it we see
essentially no false positives).
The false negatives, on the other hand, are passed voluntarily by a few
users to the daily crontab job, but are deleted afterwards.
0.000 0 31125 0 non-token data: nspam
0.000 0 239162 0 non-token data: nham
Your bayes is trained with a strong bias towards ham. It should be more the
other way, since the raw volume of email is biased towards spam.
I suggest you also consider either disabling autolearn, or pushing the
learn-as-ham threshold lower.
I would be glad to do the latter, if I knew where to find such a threshold.
There is nothing like that in /etc/mail/spamassassin/local.cf, nor can I
find any documentation of the configuration parameters on the wiki site.
Would it be one of these two in /usr/share/spamassassin/10_misc.cf?
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 12.0
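If so, I assume I should not edit 10_misc.cf itself but override the value
in /etc/mail/spamassassin/local.cf, next to the bayes_* lines already there.
A minimal sketch of what I have in mind follows; the -1.0 is only a
placeholder I picked for illustration, not a recommended or tested value,
and I am assuming that bayes_auto_learn_threshold_nonspam is indeed the
"learn-as-ham" threshold you mean:

```
# /etc/mail/spamassassin/local.cf (sketch, untested)
bayes_auto_learn 1
bayes_learn_to_journal 1
bayes_learn_during_report 1

# Autolearn as ham only messages scoring below this value.
# The shipped value in 10_misc.cf is 0.1; -1.0 is a hypothetical, stricter choice.
bayes_auto_learn_threshold_nonspam -1.0

# Left at the shipped value: autolearn as spam only at or above this score.
bayes_auto_learn_threshold_spam 12.0
```

I suppose I could then run "spamassassin --lint" to check the syntax before
restarting spamd.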
--
Lucio Chiappetti - INAF/IASF - via Bassini 15 - I-20133 Milano (Italy)
For more info : http://www.iasf-milano.inaf.it/~lucio/personal.html
-----------------------------------------------------------------------
"Nature" on government cuts to research http://snipurl.com/4erid
"Nature" e i tagli del governo alla ricerca http://snipurl.com/4erko