John Rudd wrote:
Kenneth Porter wrote:
--On Friday, October 13, 2006 9:23 AM +0100 Justin Mason
<[EMAIL PROTECTED]> wrote:
Please bear in mind, also, that there are 5 different rules that
use RFCI data, and they have wildly varying accuracies and scores:
  SPAM%     HAM%    S/O    RANK  SCORE  NAME
 3.7247   0.0540   0.986   0.85   2.60  DNS_FROM_RFC_DSN
 2.2447   0.1700   0.930   0.73   1.94  DNS_FROM_RFC_BOGUSMX
15.1533   4.6068   0.767   0.51   1.45  DNS_FROM_RFC_POST
18.6219   8.6003   0.684   0.49   1.71  DNS_FROM_RFC_ABUSE
 6.4258   4.0476   0.614   0.48   0.20  DNS_FROM_RFC_WHOIS
DNS_FROM_RFC_DSN fires on 3.7247% of spam, and only 0.054% of ham,
giving it an accuracy of 98.6%.
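For anyone wanting to check the arithmetic, the S/O column appears to be
SPAM% divided by (SPAM% + HAM%). That definition is an assumption inferred
from the figures quoted above, not taken from the mass-check code itself,
and the function name below is purely illustrative. A rough Python sketch:

```python
# Assumed definition: S/O = SPAM% / (SPAM% + HAM%).
# Figures are copied from the table quoted in this message.
RFCI_RULES = [
    ("DNS_FROM_RFC_DSN", 3.7247, 0.0540),
    ("DNS_FROM_RFC_BOGUSMX", 2.2447, 0.1700),
    ("DNS_FROM_RFC_POST", 15.1533, 4.6068),
    ("DNS_FROM_RFC_ABUSE", 18.6219, 8.6003),
    ("DNS_FROM_RFC_WHOIS", 6.4258, 4.0476),
]


def spam_over_overall(spam_pct, ham_pct):
    """Fraction of a rule's hits that land on spam (hypothetical helper)."""
    return spam_pct / (spam_pct + ham_pct)


for name, spam_pct, ham_pct in RFCI_RULES:
    print(f"{name:22s} {spam_over_overall(spam_pct, ham_pct):.3f}")
```

Under that assumption the computed values match the S/O column to three
decimal places for all five rules.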
OTOH, DNS_FROM_RFC_POST, DNS_FROM_RFC_ABUSE, and DNS_FROM_RFC_WHOIS will
likely not make it into the next release going by those rates.
Rather than remove them, would it make sense to rescore them with a
much lower weight, perhaps in some automated way? Even if the rules
were useless, it might be desirable to give them a "report only" score
(I think 0.001?) for the human who reviews the reports.
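To make the "automated" idea concrete, here is a minimal Python sketch of
what such a rescoring pass might look like. The 0.9 S/O cutoff and the
helper name are hypothetical choices for illustration, and this is not how
the mass-check machinery actually assigns scores. It only emits `score`
lines of the kind one would put in a local configuration file, using the
figures from the table quoted above.

```python
# Hypothetical rescoring pass: rules whose S/O falls below a cutoff get a
# "report only" score instead of being removed.
REPORT_ONLY_SCORE = 0.001  # the value suggested in this message
MIN_S_O = 0.9              # hypothetical cutoff, chosen for illustration

# (rule name, S/O, current score), copied from the quoted table.
RFCI_RULES = [
    ("DNS_FROM_RFC_DSN", 0.986, 2.60),
    ("DNS_FROM_RFC_BOGUSMX", 0.930, 1.94),
    ("DNS_FROM_RFC_POST", 0.767, 1.45),
    ("DNS_FROM_RFC_ABUSE", 0.684, 1.71),
    ("DNS_FROM_RFC_WHOIS", 0.614, 0.20),
]


def rescore(rules, min_s_o=MIN_S_O):
    """Return 'score NAME value' lines, demoting rules below the cutoff."""
    lines = []
    for name, s_o, score in rules:
        new_score = score if s_o >= min_s_o else REPORT_ONLY_SCORE
        lines.append(f"score {name} {new_score:.3f}")
    return lines


for line in rescore(RFCI_RULES):
    print(line)
```

With this cutoff, DNS_FROM_RFC_DSN and DNS_FROM_RFC_BOGUSMX keep their
scores, while the other three drop to 0.001 and still show up in the report.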
Cc'ing to the dev list since I'm raising the issue of changing the
mass-check machinery.
I agree: I would rather see the rules either given a default score of 0,
or something meaninglessly low. (in either case, perhaps with a comment
as to why, so it doesn't seem odd to people who stumble across them)
If it's meaninglessly low, then I can still filter on that in the report
header.
If it's 0 or meaninglessly low, then I can adjust the score for local
use without having to re-create the rule.
Though, now that I think about it, making them available via an sa-update
channel would be a useful alternative. It would let people opt in to what
might seem to be controversial or similar rule sets: move them out of the
standard rules and into an opt-in rule set.
Then I'd just have to learn the right way to deal with sa-update (which
I've been meaning to do, but just haven't).