Hi,

I am running SA in a hosted environment where SA is the MX: it scans the mail and forwards it to the real mail server. We have a report-spam facility where users report spam that got through SA. I am not using Bayes at the moment but want to start. To train Bayes we have enough spam (via users' spam reports) but not much ham.
My problem is how to get enough ham for SA training in such an environment. What is a good ham/spam ratio when training SA? Are there any other best practices I can use in such an environment?

raj