I don't know if any of you folks have read bug 62:

http://bugzilla.spamassassin.org/show_bug.cgi?id=62

but I have an idea that I'd like to sound out, to see (a) whether people would be interested in participating, (b) whether someone else is already doing something like this, and (c) whether you can see any privacy issues that might cause trouble.

Currently, SpamAssassin has a private corpus of spam that the code is tested against (I assume). The problem is that it is necessarily private, it's relatively static (I presume), and it presumably reflects only a small subset of spam. (I'm assuming there are honeypot addresses in there, but you can't catch everything!)

So, what I was wondering was this: what if we could collect general stats on pass/fail rates for the existing tests? That is, we add an option to SpamAssassin (disabled by default) that builds a daily summary of pass/fail statistics for each rule. These summaries would be collated centrally, also daily, to track in close to real time how well each spam detection rule is doing.

The first purpose would be to rank tests in most-to-least-likely-to-succeed order. As a future enhancement, it would allow near-real-time feedback on which rules are generating false positives or negatives, so that the rank of each test could be adjusted accordingly.

What do you think?

-- 
http://www.pricegrabber.com | Dog is my co-pilot.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
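
To make the proposal above a bit more concrete, here is a rough sketch of the daily-summary idea. It is in Python purely for illustration and is not SpamAssassin code. The function names (`summarize_day`, `collate`, `rank_rules`), the rule names (`RULE_A` and so on), and the choice of "fraction of scanned messages the rule fired on" as the ranking metric are all assumptions made for this example. Each site would build its own summary and send only the aggregate counts, not message contents.

```python
from collections import Counter


def summarize_day(scan_results):
    """Build one site's daily summary.

    scan_results: iterable of sets, each holding the names of the rules
    that fired on one scanned message (hypothetical input format).
    Returns (messages_scanned, Counter mapping rule name -> hit count).
    """
    hits = Counter()
    total = 0
    for fired in scan_results:
        total += 1
        hits.update(fired)  # each rule counts at most once per message
    return total, hits


def collate(summaries):
    """Merge the daily summaries from many sites into one overall tally."""
    total = 0
    hits = Counter()
    for site_total, site_hits in summaries:
        total += site_total
        hits.update(site_hits)  # adds the per-rule counts together
    return total, hits


def rank_rules(total, hits):
    """Rank rules from highest to lowest hit rate (ties broken by name)."""
    if total == 0:
        return []
    rates = ((rule, count / total) for rule, count in hits.items())
    return sorted(rates, key=lambda item: (-item[1], item[0]))


# Two hypothetical sites report their summaries for the same day.
site_a = summarize_day([{"RULE_A", "RULE_B"}, {"RULE_A"}, set()])
site_b = summarize_day([{"RULE_A"}, {"RULE_C"}])

total, hits = collate([site_a, site_b])
ranking = rank_rules(total, hits)
for rule, rate in ranking:
    print(f"{rule}: fired on {rate:.0%} of messages")
```

Note that this only covers the first purpose (ranking by hit rate). Spotting false positives or negatives would also need some ground truth about which messages really were spam, which the sketch does not attempt.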