PieterB writes:
>http://au.spamassassin.org/hacking.html lists how to submit
>mass-check results. I have a couple of questions:
>
>* The CORPUS_POLICY lists that you should use hand-verified spam/ham
>  files, but the CORPUS_SUBMIT lists that you should only check the
>  top 20 spam/ham messages. I'm pretty sure my corpus is quite good,
>  but I don't want to check every message by hand. Can anybody
>  elaborate on this policy?

You pretty much *need* to check every message by hand -- to a degree.
Otherwise SpamAssassin will be trained against unreliable data, which
is worse than no training at all.

However, the "degree" is what's key here -- "by hand" can mean scanning
over the list of From/Subject lines and occasionally clicking on one or
two to verify that they are spam (or ham). That's not very
time-consuming in general.

>* I get about 4000 genuine spams per month and have a couple of
>  mailboxes that I'm sure contain only ham mail. I receive both
>  a lot of English and Dutch e-mails.
>
>* Are there any other contributors already submitting Dutch/English
>  corpora results?

Not that I know of...

>* Should the corpora be approx. 50% ham and 50% spam?

That's an ideal; don't worry about it too much, especially for the
nightly rule-QA stuff.

>* How many people submit their mass-check results? How many messages
>  are in their corpora?

Right now, we've suspended it while the infrastructure that supports it
is being moved to apache.org. But it should be back up *soon*. Keep an
eye on the SpamAssassin-dev list for an announcement when we restart.

>Regards,
>
>Pieter
>
>BTW: is there an estimated release date set for spamassassin 2.70?

not yet ;)

--j.
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
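
The quick "by hand" check described in the reply (scanning the From/Subject
lines and opening only the ones that look out of place) could be done
roughly like this. This is only a sketch, not part of SpamAssassin or
mass-check: it assumes the corpus is stored as an mbox file, uses Python's
standard mailbox module, and the function names header_summary and
print_summary are made up for illustration.

```python
import mailbox


def header_summary(mbox_path):
    """Return a list of (index, From, Subject) tuples for an mbox file.

    Assumption: the corpus is a single mbox file. Other layouts
    (Maildir, one file per message) would need a different reader.
    """
    rows = []
    # create=False: fail instead of silently creating an empty mailbox
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, message in enumerate(box):
            sender = str(message.get("From", "")).strip()
            subject = str(message.get("Subject", "")).strip()
            rows.append((index, sender, subject))
    finally:
        box.close()
    return rows


def print_summary(mbox_path):
    """Print one line per message so the list can be skimmed by eye."""
    for index, sender, subject in header_summary(mbox_path):
        print(f"{index:5d}  {sender[:30]:<30}  {subject[:60]}")
```

Calling print_summary("spam.mbox") (a hypothetical file name) prints one
line per message. Anything that looks wrong for that folder can then be
opened in a mail client and moved before running mass-check.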