I have to change a rule and I want to do it nicely, so suggestions are needed. The rule is SUBJECT_XXX, and it contains naughty words. One of the patterns it looks for is /pen.s/i, which was just trying to catch obfuscations. Well, anything that mentions "Open source" in the subject gets tagged as naughty!
Ewww! That's a really annoying FP.
So would this be better: /[^oO]pen.s/i
Or I could easily just do this: /\bpen.s/i, but I thought that spammers could use punctuation to get past that.
How about this: spammers aren't going to obfuscate "penis" by writing "pen s", right? They're doing things like "pen!s", "pen1s", and so on, yes? So how about:
/pen\Ss/i
as your regex? Of course, this will still trigger on things like "spends", "appends", "depends", and so on. A quick look through my /usr/lib/aspell/en-only.rws gives the following types of matches on /pen.s/:
serpents, pends, pen's, ripeness
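For anyone who wants to repeat that kind of check, here is a rough Python sketch that collects the characters filling the "." slot in ordinary words. The WORDS list is a small hypothetical sample, not the aspell dictionary, and wildcard_fillers is a made-up helper name. A real run would read a full plain-text word list instead.

```python
import re

# Hypothetical sample words; swap in a full plain-text word list for a real check.
WORDS = ["serpents", "spends", "appends", "depends", "pen's", "ripeness",
         "opens", "pencil", "happens"]

# Same pattern as /pen.s/i, with the wildcard captured so it can be inspected.
PATTERN = re.compile(r"pen(.)s", re.IGNORECASE)

def wildcard_fillers(words):
    """Return the set of characters found in the '.' slot of matching words."""
    fillers = set()
    for word in words:
        for match in PATTERN.finditer(word):
            fillers.add(match.group(1).lower())
    return fillers

print(sorted(wildcard_fillers(WORDS)))
```

Whatever characters turn up here in legitimate words are the candidates for exclusion from the wildcard position.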
So, as long as you're targeting English only (which seems reasonable for spam involving such obfuscation), you can probably get away with:
/pen[^ tde']s/i
But it seems a little unsatisfactory, somehow.
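If you want to compare the candidates side by side before changing the rule, something like the following Python sketch might help. The SUBJECTS list is made up for illustration, and the dictionary keys are just labels I picked. The patterns themselves are the ones from this thread, compiled case-insensitively to mirror the /i flag.

```python
import re

# Patterns from this thread; re.IGNORECASE stands in for the /i flag.
PATTERNS = {
    "original": re.compile(r"pen.s", re.IGNORECASE),
    "non_space": re.compile(r"pen\Ss", re.IGNORECASE),
    "excluded": re.compile(r"pen[^ tde']s", re.IGNORECASE),
}

# Hypothetical subject lines: three legitimate, two obfuscated.
SUBJECTS = [
    "Open source release",
    "It depends on the budget",
    "Enlarge your pen!s",
    "pen1s pills",
    "Serpents in the garden",
]

def hits(name):
    """Return the sample subjects that the named pattern would tag."""
    return [s for s in SUBJECTS if PATTERNS[name].search(s)]

for name in PATTERNS:
    print(name, hits(name))
```

On these samples the original pattern tags everything, the \S version drops only "Open source", and the exclusion-class version tags just the two obfuscated subjects. That only says something about this handful of strings, so it is worth running against a real corpus of ham subjects too.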
--Kai MacTane
----------------------------------------------------------------------
"I looked Death in the face last night,/I saw him in a mirror,
And he simply smiled,/He told me not to worry:
He told me just to take my time."
--Oingo Boingo, "We Close Our Eyes"