> > 1) The documentation says a large corpus is necessary. How
> > large are we talking? 1000? 10,000? 1,000,000 messages?
>
>Hi Philip --
>
>we used about 200,000 last time.
We're getting about 80% true positives and 5% false positives right now. Our corpus has about 20k messages, all from business (i.e., not personal) email accounts. Should I expect to see decent improvements with a corpus of that size?
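(For reference, the rates are meant in the usual sense: the true positive rate is the fraction of spam that gets flagged, and the false positive rate is the fraction of non-spam that gets flagged. A minimal sketch in Python; the function name `rates` and the counts are hypothetical, chosen only to reproduce the 80%/5% figures, and are not our actual corpus numbers.)

```python
def rates(spam_flagged, spam_total, ham_flagged, ham_total):
    """Return (true_positive_rate, false_positive_rate) as fractions.

    spam_flagged / spam_total: spam messages flagged as spam, out of all spam.
    ham_flagged / ham_total:   non-spam messages flagged as spam, out of all non-spam.
    """
    if spam_total <= 0 or ham_total <= 0:
        raise ValueError("need at least one spam and one non-spam message")
    return spam_flagged / spam_total, ham_flagged / ham_total


# Hypothetical counts, for illustration only.
tpr, fpr = rates(spam_flagged=800, spam_total=1000, ham_flagged=50, ham_total=1000)
print(f"TP rate {tpr:.0%}, FP rate {fpr:.0%}")
```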
> Sounds like the README is out of date; pgapack is what you need.
Thanks. I installed PGAPACK, but now I'm getting this error (I started with only 3 messages, just to get it working):
Read test results for 3 messages (3 total).
Read scores for 175 tests.
PGASetMutationProb: Invalid value of mutation_prob: inf
PGAError: Fatal
Any ideas? The docs are pretty sparse (and in some cases inaccurate) in the masses directory of the SpamAssassin install. Has anybody gone through the process and created more detailed docs?
Thanks,
Philip
----------
Philip Tucker
Zix Research Center
Anti-Spam Team Lead
214.370.2068