At 10/29/03 10:53 AM, Chris Santerre wrote:
I have to change a rule and I want to do it nicely, so suggestions are needed.
The rule is:
SUBJECT_XXX
and in it, it has naughty words. One of the patterns it looks for is:
/pen.s/i
which was just trying to get past obfuscations. Well, anything that mentions "Open source" in the subject gets tagged as naughty!

Ewww! That's a really annoying FP.


So would this be better:
/[^oO]pen.s/i

Or I could easily just do this:
/\bpen.s/i
But I thought that spammers could use punctuation to get past that.
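
To make the trade-off between those two candidates concrete, here is a small sketch in Python (chosen only because it is easy to run; for these particular patterns its re module should behave the same as Perl's). The sample subjects and the pattern labels are made up for illustration, not taken from real mail or from the actual rule.

```python
import re

# Candidate patterns from the thread, compiled case-insensitively (the /i flag).
# The dictionary keys are hypothetical labels, not SpamAssassin rule names.
PATTERNS = {
    "original": re.compile(r"pen.s", re.I),
    "not_after_o": re.compile(r"[^oO]pen.s", re.I),
    "word_boundary": re.compile(r"\bpen.s", re.I),
}

# Made-up sample subjects, for illustration only.
SUBJECTS = [
    "Open source release",
    "Pen1s pills",
    "Get *pen!s* pills",
    "Get _pen1s_ pills",
]

def hits(subject):
    """Return the labels of the patterns that match the subject."""
    return [name for name, rx in PATTERNS.items() if rx.search(subject)]

for s in SUBJECTS:
    print(f"{s!r}: {hits(s)}")
```

On these samples the [^oO] version misses a subject that starts with the word, because it needs some character in front of "pen". The \b version is not defeated by punctuation such as "*" in front of the word, but it is defeated by a leading underscore, digit, or letter, since those count as word characters.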

How about this: spammers aren't going to obfuscate "penis" by writing "pen s", right? They're doing things like "pen!s", "pen1s", and so on, yes? So how about:


/pen\Ss/i

as your regex? Of course, this will still trigger on things like "spends", "appends", "depends", and so on. A quick look through my /usr/lib/aspell/en-only.rws gives the following types of matches on /pen.s/:

serpents
pends
pen's
ripeness
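
One way to turn a list like this into an exclusion class is to collect the character that sits between "pen" and "s" in each legitimate match. The sketch below does that in Python for just the four words quoted above; a real run would read a full word list instead, and the helper name middle_chars is hypothetical.

```python
import re

# The four example words quoted above; a real run would read a full word list.
words = ["serpents", "pends", "pen's", "ripeness"]

# Capture whatever single character sits between "pen" and "s".
middle = re.compile(r"pen(.)s", re.I)

def middle_chars(word_list):
    """Collect the character between 'pen' and 's' for every match."""
    found = set()
    for w in word_list:
        for m in middle.finditer(w):
            found.add(m.group(1).lower())
    return found

print(sorted(middle_chars(words)))
```

For these four words the set comes out as the apostrophe plus "d", "e", and "t".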

So, as long as you're targeting English only (which seems reasonable for spam involving such obfuscation), you can probably get away with:

/pen[^ tde']s/i

But it seems a little unsatisfactory, somehow.
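
A quick check of the two patterns suggested here, again sketched in Python as a stand-in for Perl. The test strings are illustrative, and the "tricky" case is my own assumption about what a legitimate subject might contain.

```python
import re

# The two suggestions from this message, case-insensitive like /i.
non_space = re.compile(r"pen\Ss", re.I)
narrowed = re.compile(r"pen[^ tde']s", re.I)

def matches(rx, text):
    """True if the pattern is found anywhere in the text."""
    return rx.search(text) is not None

legit = ["Open source", "serpents", "depends", "pen's", "ripeness"]
spammy = ["pen!s", "pen1s", "PEN|S"]
tricky = ["open-source"]  # hyphenated spelling, assumed for illustration

for text in legit + spammy + tricky:
    print(f"{text!r}: non_space={matches(non_space, text)} "
          f"narrowed={matches(narrowed, text)}")
```

The narrowed class passes all of the legitimate samples and still catches the obfuscated ones, but a hyphenated "open-source" matches both patterns, which is one concrete reason a character blacklist like this feels unsatisfactory.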

                                                --Kai MacTane
----------------------------------------------------------------------
"I looked Death in the face last night,/I saw him in a mirror,
 And he simply smiled,/He told me not to worry:
 He told me just to take my time."
                                                --Oingo Boingo,
                                                 "We Close Our Eyes"



_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk