Hi,

> On Mon, Jan 19, 2004 at 03:21:06PM -0500, Larry Gilson wrote:
> > http://useast.spamassassin.org/doc/Mail_SpamAssassin_Conf.html#learning%20options
> >
> > bayes_ignore_header header_name
>
> ::bangs head on wall:: How did I miss *that*? Thanks for correcting
> my careless reading.
>
> In a broader sense though, shouldn't fields like To: be excluded by
> default? It seems like if I receive more than 50% spam, this is a
> recipe for disaster. Of course, some spam won't have a valid To:
> field, but it seems like constant things like this will be very bad
> arbiters.

Although I agree that this Bayes behaviour on To: is good, this thread raised an interesting question for me: does the Bayes calculation take the spam:ham ratio into account?

Suppose I have a constant header line (word) that is present in every spam and every ham message, but I get 10 times more spam than ham, so the counter for this word is 10 times bigger in the spam column than in the ham column. Will Bayes then think this word means a 10:1 spam probability? That would be bad, of course! The word carries no information: it is exactly as likely to appear in spam as in ham. And we all have some constant headers, just think of the Received: line containing your mail server's name/IP...

I wonder whether the Bayes DB normalizes the per-word spam/ham counts by the total number of spam/ham messages learned. Then it would find that my word is present in 100% of all spam messages and 100% of all ham messages, so it means a 50% spam probability (instead of 10:1, which means about 91%).

A'rpi / Astral & ESP-team

--
Developer of MPlayer G2, the Movie Framework for all - http://www.MPlayerHQ.hu
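
To make the arithmetic in the question concrete, here is a minimal sketch that compares the two ways of scoring a single token. It is only an illustration of the reasoning above, not code taken from SpamAssassin; the function names and the corpus sizes (1000 spam, 100 ham) are assumptions chosen to match the 10:1 example.

```python
def raw_count_probability(spam_hits, ham_hits):
    """Spam probability from raw token counts, ignoring corpus sizes."""
    return spam_hits / (spam_hits + ham_hits)


def normalized_probability(spam_hits, ham_hits, total_spam, total_ham):
    """Spam probability from per-class frequencies (count / messages learned)."""
    spam_freq = spam_hits / total_spam
    ham_freq = ham_hits / total_ham
    return spam_freq / (spam_freq + ham_freq)


# Hypothetical corpus: 10 times more spam than ham, and a constant header
# token that appears in every single message of both kinds.
total_spam, total_ham = 1000, 100

raw = raw_count_probability(total_spam, total_ham)
norm = normalized_probability(total_spam, total_ham, total_spam, total_ham)

print(f"raw counts: {raw:.3f}")   # 10:1 odds, i.e. 10/11
print(f"normalized: {norm:.3f}")  # token is uninformative
```

With raw counts the constant token looks strongly spammy (10/11, about 0.909), while dividing each count by the number of messages learned in that class gives 0.5, i.e. the token says nothing either way.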