On Feb 20, 2014, at 10:34 AM, Axb <axb.li...@gmail.com> wrote:
> I hope you're running SA 3.4 so:

I am still on 3.3.2 because nobody has yet packaged 3.4 for CentOS 5.x, from what I can tell. I have the package from the rpmforge-extras repo, and 3.3.2 is still the most current version there (and on Atomic and AtRPMs). I'm not sure who is responsible for updating the packages, but I'll probably have to wait a while until they get 3.4 uploaded there.

> Assuming you can check maillogs and can either detect some spammed unknown
> user patterns or have a dedicated trap domain to spare, I'd accept that mail
> and write some header rules to score the trap rcpt/domain REAL high and use a
> rule like
>
> tflags RULENAME autolearn_force

I'm not entirely sure what you mean here. Are you saying to use a honeypot/spamtrap to feed the Bayes DB? My problem is not that my Bayes DB doesn't have enough spam in it, it's that these particular FNs are scoring 00. Let me note that the Bayes DBs are per-user, not per-domain.

Here's the magic output from my Bayes DB:

0.000          0          3          0  non-token data: bayes db version
0.000          0     239650          0  non-token data: nspam
0.000          0      85695          0  non-token data: nham
0.000          0     145773          0  non-token data: ntokens
0.000          0 1387110367          0  non-token data: oldest atime
0.000          0 1392917375          0  non-token data: newest atime
0.000          0 1392886526          0  non-token data: last journal sync atime
0.000          0 1392637273          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0       9005          0  non-token data: last expire reduction count

I don't think this counts as a "small" DB, does it? Bayes is set to autolearn, and I manually run sa-learn about once a week on my spam folder (to learn the FNs, plus lower-scoring spam that was not autolearned). MANY such image spams are caught properly, including by Bayes; the problem is that some of them, somehow, manage to slip through and score very low (00 or 20).
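
As a rough sanity check on the magic figures quoted above, here is a minimal Python sketch that turns the raw counters into something easier to read. The names `MAGIC` and `summarize` are made up for illustration; the numbers are copied from the dump, and the only assumptions are that the atime values are Unix timestamps and that the expire delta is in seconds.

```python
from datetime import datetime, timezone

# Values copied from the "magic" dump above (non-token data lines).
MAGIC = {
    "nspam": 239650,
    "nham": 85695,
    "ntokens": 145773,
    "oldest atime": 1387110367,
    "newest atime": 1392917375,
    "last expire atime delta": 5529600,
}


def to_utc_date(epoch_seconds):
    """Render a Unix timestamp as an ISO date string in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).date().isoformat()


def summarize(magic):
    """Derive a few human-readable figures from the raw counters."""
    total = magic["nspam"] + magic["nham"]
    return {
        "messages_learned": total,
        "spam_fraction": magic["nspam"] / total,
        "oldest": to_utc_date(magic["oldest atime"]),
        "newest": to_utc_date(magic["newest atime"]),
        # Assumption: the delta is expressed in seconds.
        "expire_delta_days": magic["last expire atime delta"] / 86400,
    }
```

Calling `summarize(MAGIC)` gives the total number of learned messages, the share of them that were spam, the date range covered by the token access times, and the expire delta in days.
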
I just have no idea how those messages are getting through (which is why I should start enabling token output in the headers and look). That's why I was thinking of scoring AC_SPAMMY_URI_PATTERNS very high if Bayes is scoring very low, although I guess that kind of defeats the purpose of Bayes and introduces the risk of FPs.

--
Amir