I don't know if any of you folks have read bug 62:

http://bugzilla.spamassassin.org/show_bug.cgi?id=62

but I have an idea that I'd like to sound out, to see (a) whether people
would be interested in participating, (b) whether someone else is
already doing something like this, and (c) whether you can see any
privacy issues that might cause trouble.

Currently, SpamAssassin has a private corpus of spam that the code is
tested against (I assume). The problem is that it is necessarily
private, it's relatively static (I presume), and it presumably reflects
only a small subset of spam. (I'm assuming there are honeypot addresses
in there, but you can't catch everything!)

So, what I was wondering was this: what if we could collect general
stats on pass/fail rates for the existing tests? That is, we add an
option to SpamAssassin (disabled by default) that builds a daily summary
of pass/fail statistics for each rule. These summaries would be collated
centrally, also daily, to show in near real time how well each spam
detection rule is doing. The first purpose would be to rank tests in
most-to-least likely to succeed order. As a future enhancement, it
would allow near-real-time feedback on which rules are generating false
positives or negatives, and so adjust the rank of each test.

What do you think?
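
For what it's worth, here is a rough sketch of how I picture the daily
summary and the central collation. It is only an illustration, not
anything that exists in SpamAssassin today: the function names, the rule
names, and the idea of representing each message as the set of rules
that fired on it are all my own assumptions. Note that a summary holds
nothing but per-rule counts, which is the part I'd want checked for
privacy problems.

```python
from collections import Counter


def daily_summary(messages, all_rules):
    """Build one site's daily summary (hypothetical format).

    messages:  iterable of collections of rule names that fired on each message
    all_rules: list of every rule name the site has enabled
    Returns {rule: (fired_count, not_fired_count)}; no message content is kept.
    """
    hits = Counter()
    total = 0
    for fired in messages:
        total += 1
        # set() so a rule is counted at most once per message
        hits.update(r for r in set(fired) if r in all_rules)
    return {r: (hits[r], total - hits[r]) for r in all_rules}


def collate(summaries):
    """Central step: add up the per-rule counts from many sites."""
    combined = {}
    for summary in summaries:
        for rule, (fired, not_fired) in summary.items():
            f, n = combined.get(rule, (0, 0))
            combined[rule] = (f + fired, n + not_fired)
    return combined


def rank(combined):
    """Order rules from most to least likely to fire (ties broken by name)."""
    def hit_rate(rule):
        fired, not_fired = combined[rule]
        seen = fired + not_fired
        return fired / seen if seen else 0.0
    return sorted(combined, key=lambda rule: (-hit_rate(rule), rule))


# Made-up rule names and made-up traffic from two sites.
rules = ["EXAMPLE_RULE_A", "EXAMPLE_RULE_B", "EXAMPLE_RULE_C"]
site_1 = daily_summary([{"EXAMPLE_RULE_A", "EXAMPLE_RULE_B"},
                        {"EXAMPLE_RULE_A"}], rules)
site_2 = daily_summary([{"EXAMPLE_RULE_B"}, {"EXAMPLE_RULE_B"}], rules)
ranking = rank(collate([site_1, site_2]))
```

Ranking by raw hit rate only covers the first purpose above. Spotting
false positives or negatives would need each site to also report whether
a message really was spam, which this sketch leaves out.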

-- 
          http://www.pricegrabber.com | Dog is my co-pilot.

                                   



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
