Thanks for clarifying, Justin!

--Larry
> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: Monday, January 19, 2004 11:35 PM
> To: Larry Gilson
> Cc: 'Ross Vandegrift'; [EMAIL PROTECTED]
> Subject: Re: [SAtalk] Bayes mis-learning problem
>
> Larry Gilson writes:
> > > In a broader sense though, shouldn't fields like To: be excluded by
> > > default? It seems like if I receive more than 50% spam, this is a
> > > recipe for disaster. Of course, some spam won't have a valid To:
> > > field, but it seems like constant things like this will be very bad
> > > arbiters.
> >
> > That seems like a reasonable assumption in that specific case. However, it
> > may not be a good assumption with the large audience that SA serves. Some
> > people get more spam than ham, others get more ham than spam, and yet others
> > get roughly even amounts. It sounds like your experience is more on the
> > extreme upper end of the spam/ham ratio. I would think that at either end of
> > the spectrum though, the To: field is not a good indicator of either spam or
> > ham regardless of the numbers.
>
> Actually, it works quite well. Some people get more spam than ham to
> specific To addrs, so those become spam signs -- but once a ham arrives
> at those addrs, the ham signs outweigh the To spam-sign and redeem
> the mail.
>
> At least, that's how it worked out in our testing; initially we did
> not tokenise these headers, but in testing, we found that they did
> increase accuracy.
>
> --j.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
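
For anyone following along, here is a rough sketch of the effect described in the quoted message: one spammy To: token being outweighed by several hammy tokens. This is not SpamAssassin's actual combining code. It assumes plain naive-Bayes log-odds combining with equal priors, and every token name and probability below is made up for illustration.

```python
import math


def combine(token_probs):
    """Combine per-token spam probabilities into one score.

    Assumption: plain naive-Bayes combining in log-odds space with
    equal spam/ham priors. Used here only to illustrate the idea.
    """
    log_odds = sum(math.log(p) - math.log(1.0 - p) for p in token_probs.values())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical per-token spam probabilities (not real learned values).
# The To: address has mostly received spam, so its token looks spammy.
to_only = {"To:sales@example.com": 0.90}

# The same To: token, plus tokens that have mostly been seen in ham.
with_ham = {
    "To:sales@example.com": 0.90,
    "body:invoice": 0.10,
    "body:meeting": 0.15,
    "From:colleague@example.com": 0.05,
}

spam_score_to_only = combine(to_only)
spam_score_with_ham = combine(with_ham)

print(f"To: token alone:    {spam_score_to_only:.3f}")
print(f"To: plus ham signs: {spam_score_with_ham:.3f}")
```

With these invented numbers the To: token on its own scores as likely spam, but the combined score drops well below 0.5 once the ham tokens are included, which matches the "ham signs redeem the mail" behaviour described above.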