I would like to set up a site-wide spam filter using SpamAssassin. In addition to using the network rules, I would like to set up something where my users can submit their messages to the system as ham or spam so the Bayesian system can learn from them.
I read in the Wiki that you can redirect/bounce a message, with its mail headers intact, to a couple of mailboxes (one for ham, one for spam) and then run sa-learn on them. The following Wiki page describes how to do it for several mail clients:

http://wiki.apache.org/spamassassin/ResendingMailWithHeaders

My site uses Outlook. I did the resend as a test and compared the headers of the original message with those of the resent one. There are what I consider some significant differences in the headers. I realize that certain ones will be different, but I'm not sure whether the other ones will make a difference when sa-learn classifies the messages. The headers that differ are as follows:

*) Return-Path
*) Sender - this doesn't exist in the original message
*) Delivered-To - this doesn't exist in the resent message

Of course, the headers showing the delivery path are different, but I expect that.

So, that being said, here are my questions:

*) What headers does sa-learn care about? Does it take into account the delivery path headers, the Return-Path, the Sender and Delivered-To headers?
*) What happens if there are extra headers (X-Authentication-Info, for example) in the resent message but not in the original? Does this make a difference?

Any information you can provide or point me to would be appreciated.

Thanks!
-Jim
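
P.S. In case anyone wants to repeat the header comparison, here is a rough sketch of how it can be done with Python's standard email module. The two sample messages, their addresses, and the function name compare_headers are made up for illustration; they are not my real messages. It only shows which header fields differ between the two copies and says nothing about which of them sa-learn actually uses.

```python
from email import message_from_string


def compare_headers(original_raw, resent_raw):
    """Compare the header fields of two raw messages.

    Returns three sorted lists of lowercased header names:
    fields only in the original, fields only in the resent copy,
    and fields present in both but with different values.
    """
    orig = message_from_string(original_raw)
    resent = message_from_string(resent_raw)

    orig_names = {name.lower() for name in orig.keys()}
    resent_names = {name.lower() for name in resent.keys()}

    only_original = sorted(orig_names - resent_names)
    only_resent = sorted(resent_names - orig_names)
    # get_all() is case-insensitive and returns every occurrence of a field
    changed = sorted(
        name
        for name in orig_names & resent_names
        if orig.get_all(name) != resent.get_all(name)
    )
    return only_original, only_resent, changed


# Hypothetical sample messages, invented for this example.
ORIGINAL = (
    "Return-Path: <sender@example.com>\n"
    "Delivered-To: jim@example.org\n"
    "From: sender@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
RESENT = (
    "Return-Path: <jim@example.org>\n"
    "Sender: jim@example.org\n"
    "X-Authentication-Info: placeholder\n"
    "From: sender@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

only_original, only_resent, changed = compare_headers(ORIGINAL, RESENT)
print("only in original:", only_original)
print("only in resent:  ", only_resent)
print("value changed:   ", changed)
```

In practice the two raw messages would be read from files saved out of the mail client rather than written inline as above.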