On Wed, 23 Feb 2011 22:17:47 -0500 Alex <mysqlstud...@gmail.com> wrote:
> While some of the mail from that sender seems legitimate, other mail
> clearly isn't, but it has the same header as a legitimate mail, making
> it very difficult to properly train bayes or otherwise accurately
> determine that it's indeed spam and it should be discarded.

I wouldn't obsess over it. Bayes is pretty good at picking out the
relevant markers of messages and ignoring irrelevant parts. Train the
spam as spam and the non-spam as non-spam and Bayes should eventually
figure it out.

[That's my experience, at any rate. However, we use our own Bayes
implementation that works a little differently from the built-in SA
version, so maybe SA will behave differently...]

Regards,

David.
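
As a rough illustration of why a header shared by spam and non-spam need not confuse training, here is a minimal sketch of a toy token-counting Bayes classifier. It is an assumption-laden example only: it is not SpamAssassin's Bayes code, nor the separate implementation mentioned above, and the token names (including the "hdr:sender.example" header token) are hypothetical. A token that appears equally often in both classes ends up with a probability of 0.5 and contributes nothing to the score, so the remaining tokens decide the result.

```python
import math
from collections import Counter


def train(messages):
    """Count, for each token, how many messages in the class contain it."""
    counts = Counter()
    for tokens in messages:
        counts.update(set(tokens))
    return counts


def token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    """Smoothed estimate of P(spam | token); 0.5 means the token is uninformative."""
    s = (spam_counts[token] + 1) / (n_spam + 2)
    h = (ham_counts[token] + 1) / (n_ham + 2)
    return s / (s + h)


def score(tokens, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token probabilities in log-odds space; returns a value in (0, 1)."""
    log_odds = 0.0
    for t in set(tokens):
        p = token_spam_prob(t, spam_counts, ham_counts, n_spam, n_ham)
        log_odds += math.log(p) - math.log(1 - p)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical header token present in every message, spam or not.
SHARED = "hdr:sender.example"

spam = [
    [SHARED, "pills", "cheap"],
    [SHARED, "cheap", "winner"],
    [SHARED, "pills", "winner"],
]
ham = [
    [SHARED, "invoice", "meeting"],
    [SHARED, "meeting", "agenda"],
    [SHARED, "invoice", "agenda"],
]

spam_counts, ham_counts = train(spam), train(ham)
n_spam, n_ham = len(spam), len(ham)

# The shared header is equally common in both classes, so it is neutral.
shared_p = token_spam_prob(SHARED, spam_counts, ham_counts, n_spam, n_ham)

# New messages carrying the same header are separated by their other tokens.
spam_score = score([SHARED, "cheap", "pills"], spam_counts, ham_counts, n_spam, n_ham)
ham_score = score([SHARED, "invoice", "meeting"], spam_counts, ham_counts, n_spam, n_ham)

print(f"shared header: {shared_p:.2f}")
print(f"spam-like message: {spam_score:.2f}")
print(f"ham-like message: {ham_score:.2f}")
```

In a SpamAssassin setup the corresponding training step would normally be done with sa-learn (--spam for the spam, --ham for the non-spam); how closely its tokenizer and scoring match this toy model is not something the sketch claims.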