Hello Peter,

Friday, February 11, 2005, 4:17:33 AM, you wrote:

PM> but would that not mean that the bayes filter will learn the headers
PM> that spam assassin adds as spam .. and then after a while only start
PM> classing mail that already has the spam headers as bayes_99 ?

No, since the sa-learn process knows to ignore the SpamAssassin headers.
I feed EVERYTHING into Bayes, once it's been manually verified/classified.

Bob Menschel

PM> I really do not know, I am just asking.

PM> Peter

PM> Matt Kettler wrote:

>> At 05:06 PM 2/10/2005, Matias Lopez Bergero wrote:
>>
>>> Just a question,
>>> Is it worth training the bayes filter with messages already detected
>>> and flagged as spam by spamassassin? Would that do any good?
>>
>>
>> Yes. And even if they are already flagged as BAYES_99 it is still
>> worthwhile.
>>
>>
>> The reason why is that bayes does not learn that a message is spam or
>> not. Bayes learns that a given set of words and tokens were seen in
>> spam. A given spam message might be scored as spam and might already
>> score high on the bayes scale, but it can still contain valuable new
>> words to learn from. In particular the constant mutations of ways of
>> spelling drug names provide a constant stream of fresh new spam
>> indicators for bayes to learn about. Learning about these helps it
>> identify future spam messages that might not otherwise look very
>> spam-like, and offers you some protection from false negatives caused by
>> spam mutations.
>>
>>
>> The only time it's not worthwhile is if the message was already learned
>> as spam (ie: by the autolearner).. but in that case SA will just ignore
>> you. You're wasting some cpu time, but you won't damage or corrupt
>> anything.
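
To make the points in this thread concrete, here is a rough toy sketch in Python. It is not SpamAssassin's actual code or data format; the class ToyBayesStore, the helper names, the tokenizer, and the sample messages are all made up for illustration. It only shows the three ideas discussed above: X-Spam-* headers are dropped before learning, learning works on tokens rather than on whole messages, and re-learning an already learned message changes nothing.

```python
import re
from collections import Counter


def strip_sa_headers(raw):
    """Drop lines that start with 'X-Spam-' (toy stand-in for ignoring SA headers)."""
    kept = [ln for ln in raw.splitlines() if not ln.lower().startswith("x-spam-")]
    return "\n".join(kept)


def tokenize(raw):
    """Very naive tokenizer: unique lowercase alphanumeric runs."""
    return set(re.findall(r"[a-z0-9]+", strip_sa_headers(raw).lower()))


class ToyBayesStore:
    """Hypothetical token store: counts how many spam messages each token was seen in."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.seen_ids = set()

    def learn_spam(self, msg_id, raw):
        """Learn one message as spam; return the tokens that were new to the store."""
        if msg_id in self.seen_ids:
            # Already learned: skip it, nothing is damaged or double counted.
            return set()
        self.seen_ids.add(msg_id)
        tokens = tokenize(raw)
        new_tokens = {t for t in tokens if t not in self.spam_tokens}
        self.spam_tokens.update(tokens)
        return new_tokens


store = ToyBayesStore()
msg1 = "X-Spam-Flag: YES\nSubject: cheap viagra\n\nbuy online"
msg2 = "X-Spam-Status: Yes, BAYES_99\nSubject: cheap v1agra\n\nbuy online"

first_new = store.learn_spam("msg-1", msg1)
second_new = store.learn_spam("msg-2", msg2)   # already "obviously spam", still useful
repeat_new = store.learn_spam("msg-2", msg2)   # learned before, so ignored

print(sorted(second_new))
print(sorted(repeat_new))
```

The second message would already score as spam, yet it still contributes the mutated spelling as a fresh token, and nothing from the X-Spam-* lines ends up in the store. On a real installation the equivalent step is running sa-learn --spam over a folder of manually verified spam.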