Hi: A lot of mail has shown up in the group debating the soundness of Habeas's watermarking scheme. Whether that debate is on topic, I'll leave as an exercise for others. For the record, I think Habeas's idea is sound enough, provided they follow through with it. But this is not what concerns me.
What does concern me is how SpamAssassin should deal with Habeas marks, which clearly *is* on-topic. Specifically, should SpamAssassin auto-learn Habeas-marked messages as ham, as it does today? In an earlier thread, Theo said it should:

> Well, this is less a question of "should it be autolearned" and more
> of a "how good is the Habeas system"... In the perfect world, it's
> not forgable/misused and you would always accept it as a sign of ham,
> and therefore autolearning is desired.
>
> Since we don't live in the perfect world, the question is: can the
> Habeas folks act fast/complete enough so that forging/misusing the mark
> is completely minimized? If they can, then there's not a huge issue --
> yeah, some spam will get through, but they'll quickly be squashed and
> there you go. If they can't, then their whole business plan fails as
> people start ignoring the mark, and again no problem since the SA rules
> would go away.

I disagree; I think it is still a question of "should it be autolearned?" Auto-learning Habeas-marked email as ham represents an exploitable vulnerability in SpamAssassin. Spammers can send a large amount of Habeas-marked spam from untraceable throwaway accounts. It need not even be real spam that actually sells something; it could just be email stuffed with spammy words and phrases like "[EMAIL PROTECTED]", etc. This spam gets auto-learned as ham because of the Habeas mark. The spammers can then send real, traceable spam WITHOUT the Habeas mark, and it will pass SA's checks because Bayes now thinks those words are ham. We have already seen the effect of this vulnerability in action over the past two days.

I do agree the Habeas folks will need to act quickly and completely so that the effect of forgeries is minimized. However, this doesn't mean SpamAssassin needs to be a sitting duck for such forgeries.
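To make the poisoning sequence described above concrete, here is a toy sketch in Python. It is not SpamAssassin's Bayes code: the ToyBayes class, the sample messages, and the message counts are all made up for illustration, and the direct learn("ham", ...) calls simply stand in for auto-learning triggered by a forged mark.

```python
import math
from collections import Counter


class ToyBayes:
    """Tiny token-presence naive Bayes (hypothetical, for illustration only)."""

    def __init__(self):
        self.counts = {"ham": Counter(), "spam": Counter()}
        self.msgs = {"ham": 0, "spam": 0}

    def learn(self, label, text):
        # Count each distinct token once per message.
        self.counts[label].update(set(text.lower().split()))
        self.msgs[label] += 1

    def spam_probability(self, text):
        # Start from the smoothed prior odds, then add per-token log odds.
        log_odds = math.log((self.msgs["spam"] + 1) / (self.msgs["ham"] + 1))
        for tok in set(text.lower().split()):
            p_spam = (self.counts["spam"][tok] + 1) / (self.msgs["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.msgs["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))


SPAMMY = "cheap pills buy now"             # made-up spam text
NORMAL = "meeting notes attached thanks"   # made-up ham text

bayes = ToyBayes()
for _ in range(20):
    bayes.learn("spam", SPAMMY)
    bayes.learn("ham", NORMAL)

# Score of an unmarked spam before any poisoning.
before = bayes.spam_probability(SPAMMY)

# Step 1 of the attack: forged-mark junk full of spammy words arrives and
# is learned as ham (standing in for auto-learning on the mark).
for _ in range(200):
    bayes.learn("ham", SPAMMY)

# Step 2: the same spam, now sent without the mark, is scored again.
after = bayes.spam_probability(SPAMMY)

print(f"spam probability before poisoning: {before:.3f}")
print(f"spam probability after poisoning:  {after:.3f}")
```

In this toy model the second score drops well below the first, because the spammy tokens now appear far more often in the ham counts than in the spam counts. How strongly a real Bayes database reacts depends on its size and on how much forged mail gets learned.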
I think that if you just stop Bayes from auto-learning Habeas-marked mail as ham, you'd take away the vulnerability, and the downside would be almost nil. Consider: with the current scoring, if an email has a Habeas mark on it, it doesn't really need to be added to the Bayes database, since the Habeas mark will pull the score down far enough to mark it as ham in all but the most extreme cases. So we don't really need to add those particular messages to the ham database anyway (excellent ham examples though they may be). On the flip side, the negative effect of auto-learning forged Habeas mail as ham is huge.

From my perspective, I'd be willing to live with the false negatives from the forged Habeas marks themselves if they didn't also mess up my Bayes database. As it is, I have to change my Habeas scoring to 0.0 to avoid this.

Anyway, what do others think about this?

DaC