On 12/13, Kevin A. McGrail wrote:
>  Is there a formal policy for including (or excluding) DNS-based lists

>    There is formal consensus from the PMC that [what] work[s] for most
>    installations is adequate for default inclusion once the merits of the
>    BL are shown

I wanted to point out the logic behind this.  SpamAssassin is heavily
dependent on network tests for accuracy.  Disabling them results in
something like 5 times as many emails mis-classified, both false positives
and false negatives (with the default threshold of 5).  

So the default configuration is set up to need minimal adjustment for small
installations likely to have the least ability to do those adjustments, and
require more adjustment for very large installations which are inevitably
going to need to change stuff anyway.

If you can come up with a way to disable all network tests that have a
free-use limit without crippling SpamAssassin, please tell us.  That would be
lovely.

And I do think it's appropriate to discuss this in terms of all network
tests, not just DNS tests.


These are the statistics from... 2009.  Would be nice to get newer ones.
Why aren't these updated?

Set 0, no net, no bayes:
http://svn.apache.org/viewvc/spamassassin/trunk/rules/STATISTICS-set0.txt?view=markup
# False positives:       238  1.12%
# False negatives:      9678  21.93%

Set 1, with net, no bayes:
http://svn.apache.org/viewvc/spamassassin/trunk/rules/STATISTICS-set1.txt?view=markup
# False positives:        30  0.14%
# False negatives:      1381  3.13%


Without the network tests, it got 7.0x as many spams wrong (false
negatives), and 7.9x as many hams wrong (false positives).
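
For anyone who wants to check the arithmetic, here is a quick sketch that
recomputes the ratios from the counts quoted above.  The dictionary layout
and the function name are just my own illustration, nothing from
SpamAssassin itself:

```python
# Counts copied from the STATISTICS-set0.txt and STATISTICS-set1.txt
# excerpts quoted above.  Names here are illustrative only.
COUNTS = {
    "set0": {"false_positives": 238, "false_negatives": 9678},  # no net, no bayes
    "set1": {"false_positives": 30, "false_negatives": 1381},   # with net, no bayes
}


def error_ratio(kind):
    """Return how many times more errors of this kind set0 has than set1."""
    return COUNTS["set0"][kind] / COUNTS["set1"][kind]


for kind in ("false_positives", "false_negatives"):
    print(f"{kind}: {error_ratio(kind):.1f}x")
# prints:
# false_positives: 7.9x
# false_negatives: 7.0x
```

False positives are hams flagged as spam, and false negatives are spams
that got through.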


(I opened a bug about those files being out of date; I think the data
gets generated daily:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6726 )

-- 
"I love God. He's so deliciously evil." - Stewie Griffin, Family Guy 2x02
http://www.ChaosReigns.com
