This is hopefully easier than I’m thinking it is. We’ve been running without bayes for a very long time and I thought I’d give it a shot again with autolearning to see if it’s helpful. The last time I touched it was 2.6.something; back then we had spam scanning spread across four servers, and I don’t believe bayes was capable of storing tokens in mysql. We’re now down to two boxes, and I’ve configured mysql storage.
In a nutshell, this is what I’ve got:

v320.pre:

    loadplugin Mail::SpamAssassin::Plugin::Bayes

v310.pre:

    loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold

local.cf:

    # Bayes settings
    bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
    bayes_sql_dsn DBI:mysql:spamass:10.88.77.x
    bayes_sql_username spamass
    bayes_sql_password SECRET
    use_bayes 1
    bayes_auto_learn 1
    bayes_auto_learn_threshold_nonspam 0.1
    bayes_auto_learn_threshold_spam 12.0
    bayes_journal_max_size 102400
    bayes_expiry_max_db_size 150000
    bayes_auto_expire 1
    bayes_learn_to_journal 1

If I look in the mysql db, I see plenty of entries. If I run sa-learn and ask it to dump some info, that works:

    [root@spam-b /usr/local/etc/mail/spamassassin]# sa-learn --username=sp...@bway.net --dump magic
    0.000          0          3          0  non-token data: bayes db version
    0.000          0          7          0  non-token data: nspam
    0.000          0        243          0  non-token data: nham
    0.000          0      56976          0  non-token data: ntokens
    0.000          0 1435355510          0  non-token data: oldest atime
    0.000          0 1435529521          0  non-token data: newest atime
    0.000          0          0          0  non-token data: last journal sync atime
    0.000          0          0          0  non-token data: last expiry atime
    0.000          0          0          0  non-token data: last expire atime delta
    0.000          0          0          0  non-token data: last expire reduction count

But I never see any bayes rule hits in the headers of my emails. I have in my personal sql prefs the following:

    add_header ham Bayes-Toks _TOKENSUMMARY_

That’s always empty.

So, is bayes working? I clearly am adding tokens and autolearn is happening. But I’m not convinced any bayes-related rules are happening. What’s the procedure to actually test this?

Glancing at the sa-learn dump, I’m mighty suspicious of the low number of autolearn spam messages (7 0 non-token data: nspam), but I may be mis-reading that.

Any pointers to debug further? I feel like something very obvious is amiss here.

Thanks,

Charles