Rob McMillin wrote:

> Craig, I'd be curious to see this corpus -- where can I find it? I'd 
> like to know, once and for all, how badly this kills the non-spam. Also, 
> is there a testbed suite for checking the results against an arbitrary 
> corpus?

The stuff in the /masses directory of the distro is used for checking corpora.  
It's reasonably well documented, and the parts that aren't are reasonably 
readable Perl code.  The spam corpus is kept semi-secret because it contains the 
email addresses of a number of spamtraps.  There is no non-spam corpus, though a 
number of people have been contributing the output of running mass-check on 
their own mailboxes.  Most people, for obvious reasons, don't want to contribute 
their own mail to a non-spam corpus, but are happy to contribute mass-check 
output.  Let me know if you're dying to get the spam corpus, and promise me you 
won't leak it to any spammers, and I'll hook you up.


Spamassassin-talk mailing list