Matt Kettler wrote:
Sébastien AVELINE wrote:
Hello,
You will find my top-firing spamassassin rules.
I run spamassassin on several boxes, each with its own bayes_db
files. I use razor, dcc_check, uribl, bayes .... We handle hundreds of
thousands of messages per day.
In my top rules for spam you will see a lot of "collaborative rules"
like razor, uribl, dcc_check. I wonder why there aren't more heuristic
and bayesian rules among my top rules. Do you think my stats look
"normal", or is there something wrong? Any suggestions are welcome.
It's really absurd that RDNS_NONE is firing on 99.6% of email.
Do you not have RDNS for your own network, or is it generating invalid
Received: headers?
Ahh, yeah, it looks like your own network lacks RDNS:
Received: from unknown (HELO ?192.168.0.213?)
([EMAIL PROTECTED]@82.235.12.159) by smtpp.alinto.net with SMTP; Thu,
24 Jan 2008 09:30:20 +0000
If you've got a local nameserver, you might want to generate an
in-addr.arpa zone for the 192.168.0.* network to fix that.
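
As a rough illustration of the suggestion above, the PTR records for such a reverse zone could be generated with a short Python script. This is only a sketch: the `host-N.office.example` naming scheme and the `office_name` helper are hypothetical, and the local nameserver would still need to be configured as authoritative for `0.168.192.in-addr.arpa` and to serve these records.

```python
import ipaddress


def ptr_records(network, hostname_for):
    """Return one PTR record line per usable host address in `network`."""
    net = ipaddress.ip_network(network)
    lines = []
    for addr in net.hosts():
        # reverse_pointer gives e.g. "213.0.168.192.in-addr.arpa"
        lines.append(f"{addr.reverse_pointer}. IN PTR {hostname_for(addr)}.")
    return lines


def office_name(addr):
    # Hypothetical naming scheme, for illustration only.
    last_octet = str(addr).rsplit(".", 1)[1]
    return f"host-{last_octet}.office.example"


records = ptr_records("192.168.0.0/24", office_name)
for line in records[:3]:
    print(line)
```

The output lines are fully qualified, so they can be pasted into a zone file regardless of its `$ORIGIN`.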
As for the bayes, that doesn't surprise me. There are 10 different bayes
rules, and while I'd expect that collectively they add up to most of
your mail, it's not surprising that they're not individually scoring
high. It's a little surprising BAYES_50 is doing so well compared to
BAYES_99. With the chi-squared combining I'd expect BAYES_99 to edge
it out slightly. Are you doing any manual training? What does your
"sa-learn --dump magic" look like?
The local address is from my office, where I submit my mail to my
mail servers. I don't think RDNS_NONE is the main worry. Unfortunately I
don't use sa-learn to feed my bayes; I rely on the high volume of mail
that comes into my servers.
Is it really effective to train the bayes manually?
Here you can see the result from sa-learn --dump magic:
0.000 0 3 0 non-token data: bayes db version
0.000 0 3803618 0 non-token data: nspam
0.000 0 862246 0 non-token data: nham
0.000 0 496111 0 non-token data: ntokens
0.000 0 1181735997 0 non-token data: oldest atime
0.000 0 1198170104 0 non-token data: newest atime
0.000 0 1181805393 0 non-token data: last journal sync atime
0.000 0 1181779437 0 non-token data: last expiry atime
0.000 0 43200 0 non-token data: last expire atime delta
0.000 0 476160 0 non-token data: last expire reduction count
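
To make the dump easier to read, a small Python sketch like the following can pull out the counters and convert the atime values to dates. It assumes the value of each "non-token data" line sits in the third column and that atimes are Unix epoch seconds; the `parse_magic` and `as_utc` helpers are illustrative names, not part of SpamAssassin.

```python
from datetime import datetime, timezone

# Lines copied from the dump above.
DUMP = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 3803618 0 non-token data: nspam
0.000 0 862246 0 non-token data: nham
0.000 0 496111 0 non-token data: ntokens
0.000 0 1181735997 0 non-token data: oldest atime
0.000 0 1198170104 0 non-token data: newest atime
"""


def parse_magic(text):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in text.splitlines():
        head, sep, label = line.partition("non-token data:")
        fields = head.split()
        if sep and len(fields) >= 3:
            values[label.strip()] = int(fields[2])
    return values


def as_utc(epoch):
    """Interpret an atime as Unix epoch seconds (an assumption) in UTC."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


magic = parse_magic(DUMP)
spam_share = magic["nspam"] / (magic["nspam"] + magic["nham"])
print(f"spam share of learned messages: {spam_share:.1%}")
print("oldest atime:", as_utc(magic["oldest atime"]).date())
print("newest atime:", as_utc(magic["newest atime"]).date())
```

Under those assumptions, the script shows how heavily the autolearned corpus leans toward spam and how recent the newest token access is.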