David Benigni wrote:
> Hello,
>
> I'm having some problems with a rule. I'm filtering based on
> particular words (yeah, its not good to do that) and its catching things
> that I don't think it should. I can't seem to find the problem. Here
> is the rule:
>
> rawbody BADWORD_RULE_1 /\b(?:xxx|porn)\b/i
> describe BADWORD_RULE_1 Unacceptable word or phrase
> score BADWORD_RULE_1 0.1
>
> The problem is that if I have an email with an attachment, its possible
> the xxx part crops up from an encoded file. If I run the email through
> perl program with the same regex it doesn't pick it out, but SA seems
> to.

What SA version are you using?
AFAIK, you'd see this behavior for rawbody rules with 2.6x, but not with 3.x.

Workaround: use body instead of rawbody. A body rule won't match anything inside HTML tags, though, so add a second uri rule to catch those.
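
Something along these lines should do it. This is only a sketch of the workaround, untested against your setup: the rule names BADWORD_BODY_1 and BADWORD_URI_1 are made up for illustration, and the regex and score are simply copied from your original rule.

```
# Sketch only: rule names are placeholders, regex and score copied from the original rule.

# body rules are matched against the decoded text of the message with HTML markup stripped.
body     BADWORD_BODY_1  /\b(?:xxx|porn)\b/i
describe BADWORD_BODY_1  Unacceptable word or phrase in message text
score    BADWORD_BODY_1  0.1

# uri rules are matched against URIs extracted from the message, including links in HTML.
uri      BADWORD_URI_1   /\b(?:xxx|porn)\b/i
describe BADWORD_URI_1   Unacceptable word or phrase in a URI
score    BADWORD_URI_1   0.1
```

Run spamassassin --lint after adding the rules to check the syntax. Note that a message with the word in both the text and a link would hit both rules, so you may want to adjust the scores.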