On Tue, Feb 07, 2006 at 05:36:56PM -0600, Jim C. Nasby wrote:
> On Tue, Feb 07, 2006 at 06:17:20PM -0500, Matt Kettler wrote:
> > Jim C. Nasby wrote:
> > >> Are there any autolearn strings? Are they all "autolearn=no"? are there any
> > >> decent number that are autolearn=failed or autolearn=disabled?
> > >>
> > >
> > > grep -r autolearn caughtspam/ | grep -v 'Binary file' | sed -e 's/.*autolearn=\([^ ]*\).*/\1/'|sort|uniq -c
> > > 1545 no
> > >  140 spam
> > >    4 unavailable
> >
> > Fair enough, that at least suggests that the autolearner is working. However,
> > that learning ratio is pretty low.
> >
> > Are you using network tests? Without DNSBLs it's often hard to get enough header
> > points to cause spam learning..
>
> I believe so...
>
> grep loadplugin /usr/local/etc/mail/spamassassin/init.pre
> # loadplugin Mail::SpamAssassin::Plugin::RelayCountry
> loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
> loadplugin Mail::SpamAssassin::Plugin::Hashcash
> loadplugin Mail::SpamAssassin::Plugin::SPF
>
> grep -v # ~/.spamassassin/user_prefs | grep -v whitelist
> bayes_auto_learn 1
> bayes_auto_learn_threshold_spam 5.0
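
The counting pipeline quoted above gets retyped for every folder, so it may be handy to wrap it in a small shell function. This is only a sketch: the name tally_autolearn is made up here, and it assumes the autolearn= field sits on a line of its own or with the rest of X-Spam-Status, exactly as the sed expression above already assumes.

```shell
#!/bin/sh
# tally_autolearn (hypothetical helper name): read mail header lines on
# stdin and count how often each autolearn=<value> result appears.
tally_autolearn() {
    # sed -n with the p flag prints only the lines where the substitution
    # matched, so lines without an autolearn= field are dropped rather
    # than being counted as themselves.
    sed -n 's/.*autolearn=\([^ ]*\).*/\1/p' |
        sort |
        uniq -c |
        sort -rn    # most frequent result first
}

# Usage, assuming a directory of saved messages such as caughtspam/:
#   grep -rh autolearn caughtspam/ | tally_autolearn
```

The -h flag to grep only suppresses the filename prefix; the sed expression would discard it anyway, so the original grep -r form works just as well.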
Hmm... here's something interesting...

grep -r autolearn pgsql/ | grep -v 'Binary file' | sed -e 's/.*autolearn=\([^ ]*\).*/\1/' | sort | uniq -c
2010 ham
 198 no
  17 unavailable

So a big chunk of [EMAIL PROTECTED] email is being learned as ham. Looking
further, I see...

X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.1.0

ISTM that having the thresholds set up so that BAYES_00 scores low enough to
autolearn is a BadThing, as it creates a positive feedback loop. :)

I've added
bayes_auto_learn_threshold_nonspam -2.6
to my personal config; we'll see if that helps.
-- 
Jim C. Nasby, Database Architect                [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
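
P.S. A rough way to gauge how many messages are caught in that loop is to count the X-Spam-Status lines where BAYES_00 is the only test listed and the result is autolearn=ham. The sketch below assumes the header is unfolded onto a single line, as in the sample above; folded headers would be missed. The function name count_bayes_only_ham is made up for illustration.

```shell
#!/bin/sh
# count_bayes_only_ham (hypothetical helper name): read header lines on
# stdin and print how many show BAYES_00 as the sole test together with
# autolearn=ham.
count_bayes_only_ham() {
    # Matching "tests=BAYES_00 autolearn=ham" literally means any extra
    # test name (e.g. tests=BAYES_00,SPF_PASS) breaks the match.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    grep -c 'tests=BAYES_00 autolearn=ham' || true
}

# Usage, assuming the same pgsql/ folder as above:
#   grep -rh 'X-Spam-Status' pgsql/ | count_bayes_only_ham
```

Comparing that number with the total ham count for the folder gives a feel for how much of the ham learning rests on Bayes alone.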