On 12/13, Kevin A. McGrail wrote:
> Is there a formal policy for including (or excluding) DNS-based lists
> There is formal consensus from the PMC that what works for most
> installations is adequate for default inclusion once the merits of the
> BL are shown

I wanted to point out the logic behind this.

SpamAssassin is heavily dependent on network tests for accuracy. Disabling them results in something like 5 times as many emails mis-classified, both false positives and false negatives (with the default threshold of 5).

So the default configuration is set up to need minimal adjustment for small installations, which are likely to have the least ability to make those adjustments, and to require more adjustment for very large installations, which are inevitably going to need to change stuff anyway.

If you can come up with a way to disable all network tests that have a free use limit without crippling SpamAssassin, please tell us. That would be lovely.

And I do think it's appropriate to discuss this in terms of all network tests, not just DNS tests.

These are the statistics from... 2009. Would be nice to get newer ones. Why aren't these updated?

Set 0, no net, no bayes:
http://svn.apache.org/viewvc/spamassassin/trunk/rules/STATISTICS-set0.txt?view=markup
# False positives:   238   1.12%
# False negatives:  9678  21.93%

Set 1, with net, no bayes:
http://svn.apache.org/viewvc/spamassassin/trunk/rules/STATISTICS-set1.txt?view=markup
# False positives:    30   0.14%
# False negatives:  1381   3.13%

Without the network tests, it got 7.0x as many spams wrong (false negatives) and 7.9x as many hams wrong (false positives).

(I opened a bug about those files being out of date; I think the data gets generated daily:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6726 )

-- 
"I love God. He's so deliciously evil." - Stewie Griffin, Family Guy 2x02
http://www.ChaosReigns.com
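For anyone who wants to re-check the ratios, here is a minimal sketch that recomputes them from the four counts quoted from the 2009 STATISTICS files. The dictionary layout and the `error_ratio` helper are just illustrative names, not anything from SpamAssassin itself. A false positive is a ham scored as spam, and a false negative is a spam that got through.

```python
# Counts copied from the 2009 STATISTICS-set0.txt / STATISTICS-set1.txt
# figures quoted in this message. Names below are illustrative only.
set0 = {"false_positives": 238, "false_negatives": 9678}  # no net, no bayes
set1 = {"false_positives": 30, "false_negatives": 1381}   # with net, no bayes


def error_ratio(without_net: int, with_net: int) -> float:
    """How many times more errors occur when network tests are disabled."""
    return without_net / with_net


# False positives are ham messages wrongly flagged as spam.
fp_ratio = error_ratio(set0["false_positives"], set1["false_positives"])
# False negatives are spam messages wrongly passed as ham.
fn_ratio = error_ratio(set0["false_negatives"], set1["false_negatives"])

print(f"ham misclassified (false positives):  {fp_ratio:.1f}x")
print(f"spam misclassified (false negatives): {fn_ratio:.1f}x")
```

The same division can be rerun with fresh counts if the STATISTICS files get regenerated.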