-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Ricardo,

Wednesday, August 20, 2003, 9:02:12 PM, you wrote:

RK> I'm wondering how it is possible to keep bayes "fresh" with
RK> both spam and ham, and understand how can SA do
RK> auto-learning?

RK> How exactly does SA automatically provide ham to bayes?

First, SA scores the email, using whatever tools and rules are
specified in your parameters. That's the score you see in the email
headers when SA is done.

SA then throws away any Bayes score and any blacklist or whitelist
score, and compares the remaining score against conservatively set
thresholds.

If that remaining score is higher than the auto-learn spam threshold
(15 above the normal spam threshold, I think), then SA says it's so
very likely that this is spam, it'll auto-learn it as spam.

If the score is lower than the auto-learn ham threshold (I think this
is -2), then SA says it's so very likely that this is ham, it'll
auto-learn it as ham.

SA will not auto-learn anything that scores between these two
thresholds.

RK> I imagined sa-learn would have to be fed ham and spam
RK> manually, otherwise if it is done automatically, wouldn't it
RK> be erroneously counting spam as ham, in the case of
RK> false-negatives, and erroneously be taught ham when there
RK> are false-positives?

Yes, which is why these auto-learn thresholds are so much more
conservative than the spam-flagging threshold.

RK> A major concern of mine is that bayes won't get fed ham
RK> adequately and then get out of whack. Can anyone explain
RK> what affects the efficiency of bayes?

If you do nothing (no manual learning other than correcting false
positives and false negatives), Bayes probably won't get out of whack,
because of the special auto-learn thresholds.

If you go through your spam and manually sa-learn almost all of it,
then yes, you may get out of balance, unless you also manually
sa-learn a corresponding amount of ham. (I wouldn't worry about trying
for a 50/50 mix, but if you feed, say, 300 spam a week into sa-learn,
you should probably try to feed at least 30 ham in that same period of
time.)

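To make the auto-learn decision above concrete, here is a rough Python sketch of the logic as I described it. It is not SA's actual code. The function names, the idea of passing rule hits as a dictionary, and the rule names in the example are all made up for illustration, and the numbers (a normal spam threshold of 5, spam auto-learn at 15 above that, ham auto-learn below -2) are just my recollection from this message, so check your own configuration.

```python
from typing import Dict, Optional

# Hypothetical names for whitelist/blacklist rules; illustration only.
LIST_RULES = {"USER_IN_WHITELIST", "USER_IN_BLACKLIST"}


def learning_score(rule_hits: Dict[str, float]) -> float:
    """Sum the rule scores, ignoring Bayes and whitelist/blacklist hits."""
    return sum(
        score
        for name, score in rule_hits.items()
        if not name.startswith("BAYES_") and name not in LIST_RULES
    )


def auto_learn_decision(
    score: float,
    required_score: float = 5.0,   # assumed normal spam threshold
    spam_offset: float = 15.0,     # "15 above the normal threshold, I think"
    ham_threshold: float = -2.0,   # "I think this is -2"
) -> Optional[str]:
    """Return "spam", "ham", or None when the message is not auto-learned."""
    if score > required_score + spam_offset:
        return "spam"
    if score < ham_threshold:
        return "ham"
    return None  # anything between the two thresholds is left alone


if __name__ == "__main__":
    hits = {"BAYES_99": 5.4, "USER_IN_WHITELIST": -100.0, "SOME_BODY_RULE": 1.2}
    remaining = learning_score(hits)
    print(remaining, auto_learn_decision(remaining))
```

Note that the message in the example would be whitelisted overall, yet the score used for learning ignores the whitelist hit, so it falls between the thresholds and nothing is learned from it.
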
Bob Menschel

-----BEGIN PGP SIGNATURE-----
Version: PGP 8.0

iQA/AwUBP0VvWJebK8E4qh1HEQJ2fACdGbmfYwDr+9sic6QmQwDWxyV6OGEAn25s
oqJk3F0ZqAXYxeIcfSuPpYrC
=PW3V
-----END PGP SIGNATURE-----

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk