From: "Kenneth Porter" <[EMAIL PROTECTED]>
--On Friday, October 13, 2006 9:23 AM +0100 Justin Mason <[EMAIL PROTECTED]> wrote:

Please bear in mind, also, that there are 5 different rules that
use RFCI data, and they have wildly varying accuracies and scores:

  SPAM%    HAM%    S/O  RANK  SCORE  NAME
 3.7247  0.0540  0.986  0.85   2.60  DNS_FROM_RFC_DSN
 2.2447  0.1700  0.930  0.73   1.94  DNS_FROM_RFC_BOGUSMX
15.1533  4.6068  0.767  0.51   1.45  DNS_FROM_RFC_POST
18.6219  8.6003  0.684  0.49   1.71  DNS_FROM_RFC_ABUSE
 6.4258  4.0476  0.614  0.48   0.20  DNS_FROM_RFC_WHOIS

DNS_FROM_RFC_DSN fires on 3.7247% of spam, and only 0.054% of ham, giving
it an S/O (spam hits over total hits) of 0.986, i.e. an accuracy of 98.6%.
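
The S/O column can be reproduced from the first two columns. The snippet below is a minimal sketch, assuming S/O is simply SPAM% / (SPAM% + HAM%), which is consistent with every row of the table; the function name `s_o` and the `RATES` dict are illustrative only and not part of SpamAssassin.

```python
# (SPAM%, HAM%) hit rates copied from the table in this message.
RATES = {
    "DNS_FROM_RFC_DSN": (3.7247, 0.0540),
    "DNS_FROM_RFC_BOGUSMX": (2.2447, 0.1700),
    "DNS_FROM_RFC_POST": (15.1533, 4.6068),
    "DNS_FROM_RFC_ABUSE": (18.6219, 8.6003),
    "DNS_FROM_RFC_WHOIS": (6.4258, 4.0476),
}


def s_o(spam_pct, ham_pct):
    """Assumed definition: share of a rule's hits that land on spam."""
    total = spam_pct + ham_pct
    return spam_pct / total if total else 0.0


if __name__ == "__main__":
    for name, (spam, ham) in RATES.items():
        print(f"{name:22s} {s_o(spam, ham):.3f}")
```

Under that assumption, a rule that hits spam and ham equally often would sit at 0.5, so the last three rules are not far above a coin flip.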

OTOH, DNS_FROM_RFC_POST, DNS_FROM_RFC_ABUSE, and DNS_FROM_RFC_WHOIS will
likely not make it into the next release going by those rates.

Rather than remove them, would it make sense to rescore them with a much lower weight, perhaps in some automated way? Even if the rules were useless, it might be desirable to give them a "report only" score (I think 0.001?) for the human who reviews the reports.
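
To make the idea concrete, here is a rough sketch of what "automated rescoring" could look like. It only emits `score` override lines for a local configuration file and is not meant to mirror how the mass-check machinery actually assigns scores. The 0.8 S/O cutoff is a made-up threshold for illustration, the 0.001 value is the unconfirmed guess from the paragraph above, and the helper name `demote_weak_rules` is hypothetical.

```python
REPORT_ONLY_SCORE = 0.001  # value guessed at above; not confirmed
MIN_S_O = 0.8              # hypothetical cutoff, chosen only for this sketch

# Rule name -> S/O, copied from the table earlier in the thread.
S_O_BY_RULE = {
    "DNS_FROM_RFC_DSN": 0.986,
    "DNS_FROM_RFC_BOGUSMX": 0.930,
    "DNS_FROM_RFC_POST": 0.767,
    "DNS_FROM_RFC_ABUSE": 0.684,
    "DNS_FROM_RFC_WHOIS": 0.614,
}


def demote_weak_rules(s_o_by_rule, min_s_o=MIN_S_O):
    """Return 'score' lines for rules whose S/O falls below the cutoff."""
    return [
        f"score {name} {REPORT_ONLY_SCORE}"
        for name, s_o in s_o_by_rule.items()
        if s_o < min_s_o
    ]


if __name__ == "__main__":
    print("\n".join(demote_weak_rules(S_O_BY_RULE)))
```

With this cutoff the three rules named above are demoted and still show up in reports, while DNS_FROM_RFC_DSN and DNS_FROM_RFC_BOGUSMX keep their existing scores.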

Cc'ing to the dev list since I'm raising the issue of changing the mass-check machinery.

They may be broken for you. For me only RFC_ABUSE is broken. They are
checking it against live data. If the procedure is wrong for RFCI then
the procedure is wrong for everything else, too. If you are experiencing
a special case then develop your own special-case scores.

{^_^}
