Duncan,

>One other problem is that the GA currently (IIRC) doesn't process the
>messages, just the tests hit. Of course, now, the test are different from
>those 2 versions ago, messing up the GA.
Replacing the message with the result of the tests would be pretty simple, I believe.

X-Spam-Status: No, hits=-2.0 required=5.0 tests=IN_REP_TO version=2.01

The X-Spam-Status header more or less gives the results already. Checking carefully for false positives would be much more of an issue, though.

Now, I may be wrong, but how can new tests be introduced if the GA does not account for them and give them a weight?

>Furthermore, everyone has a different idea of what spam is. Is commercial
>e-mail, that was sent by a company who legitimately has your e-mail address,
>spam?

That is exactly why I would like to have my own corpus.

>I imagine that the size of the corpus is not as important as the variety of
>messages, its currentness, and the accuracy of its filing.

Since I don't know how to guarantee variety, currentness, and accuracy up front, one good way is to monitor incoming email until I have accumulated X different messages, with X large enough that I am sure I cover all situations.

Olivier

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
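
To illustrate the idea of keeping only the test results instead of the whole message, here is a minimal sketch in Python that pulls the fields out of an X-Spam-Status value like the one quoted above. The function name `parse_spam_status` and the returned dictionary layout are made up for illustration and are not part of SpamAssassin. The sketch assumes the header has already been unfolded onto one line and that the tests list contains no whitespace.

```python
import re


def parse_spam_status(value):
    """Parse an X-Spam-Status header value into a dict (hypothetical helper).

    Assumes a single unfolded line such as:
    "No, hits=-2.0 required=5.0 tests=IN_REP_TO version=2.01"
    """
    # The verdict ("Yes"/"No") comes before the first comma.
    verdict, _, rest = value.partition(",")
    # The remaining fields are whitespace-separated key=value pairs.
    fields = dict(re.findall(r"(\w+)=(\S+)", rest))
    # Test names are comma-separated; an empty list means no test hit.
    tests = [t for t in fields.get("tests", "").split(",") if t]
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "hits": float(fields["hits"]),
        "required": float(fields["required"]),
        "tests": tests,
        "version": fields.get("version"),
    }


status = parse_spam_status(
    "No, hits=-2.0 required=5.0 tests=IN_REP_TO version=2.01"
)
print(status["tests"], status["hits"], status["version"])
```

Recording the `version` field next to the tests would make it possible to tell apart results produced by different rule sets, which is the problem raised in the first quoted paragraph.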