Fletcher Mattox wrote:
> Hi,
>
> After years of stability, my bayes db is doing poorly. When I first
> noticed it, it was classifying lots of ham BAYES_99, I cleared the db
> and started over. Now it finds *very* few ham.
>
> 0.000 0 3 0 non-token data: bayes db version
> 0.000 0 14779 0 non-token data: nspam
> 0.000 0 86 0 non-token data: nham
> 0.000 0 231925 0 non-token data: ntokens
> 0.000 0 1177142672 0 non-token data: oldest atime
> 0.000 0 1179789654 0 non-token data: newest atime
> 0.000 0 1179789681 0 non-token data: last journal sync atime
> 0.000 0 1179761284 0 non-token data: last expiry atime
> 0.000 0 43200 0 non-token data: last expire atime delta
> 0.000 0 90881 0 non-token data: last expire reduction count
>
> I've seen people report large spam/ham ratios on this list, but this
> seems extreme, >170:1. So I added about 500 ham (I am sure of the
> quality) to the db with "sa-learn --ham", hoping that would help.
> But it is still behaving poorly, over 20% of my ham is BAYES_99.
> (Normally less than 1% of my ham is BAYES_99.)
>
> Does anyone know why my system can't find any ham? It's a fairly typical
> university site of about 10000 messages/day with a 50/50 ham/spam ratio,
> so I know it is receiving plenty of ham. Running 3.2.0 if it matters.
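For reference, the ">170:1" figure can be checked directly from the nspam and nham lines quoted above. The following is only a rough sketch in Python: the helper name parse_magic is made up for illustration, and it assumes the counter value is always the third whitespace-separated column, as it is in the quoted dump.

```python
# Sketch: pull the "non-token data" counters out of a bayes dump like the
# one quoted above and compute the learned spam/ham ratio.
# parse_magic is a hypothetical helper, not part of SpamAssassin.

SAMPLE = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 14779 0 non-token data: nspam
0.000 0 86 0 non-token data: nham
0.000 0 231925 0 non-token data: ntokens
"""


def parse_magic(dump: str) -> dict:
    """Map each 'non-token data' label to its integer value."""
    counts = {}
    for line in dump.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:", 1)
        # Assumption: the value sits in the third column of the line.
        counts[label.strip()] = int(fields.split()[2])
    return counts


counts = parse_magic(SAMPLE)
ratio = counts["nspam"] / counts["nham"]
print(f"learned spam:ham ratio is about {ratio:.1f}:1")
```

With the numbers from the quoted dump (14779 spam, 86 ham) this comes out a little under 172:1, which agrees with the ">170:1" estimate.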
1) Does your MTA (mail server) use DNSBL lists to block spam?
   Which lists does it use? [abuse sources, DUL]
2) Do you use greylisting?
   [in combination with CBL.abuseat.org or a list containing it]

SpamAssassin is an effective but costly tool for spam defense. It should
be used as *the second* line of spam defense, after deploying less
effective but much cheaper defenses such as DNSBL lookups at the MTA
level. Such a deployment scheme should reduce the spam/ham ratio seen by
SpamAssassin.

-- 
[pl>en: Andrew] Andrzej Adam Filip : [EMAIL PROTECTED] : [EMAIL PROTECTED]
Home site: http://anfi.homeunix.net/