--- "John D. Hardin" <[EMAIL PROTECTED]> wrote:

> On Sat, 7 Apr 2007, J. wrote:
> 
> > --- "John D. Hardin" <[EMAIL PROTECTED]> wrote:
> > 
> > > You might want to look at this instead of trying to hand-roll 
> > > obfuscation rules:
> > > 
> > > http://www.impsec.org/~jhardin/antispam/obfusc.pl
> > 
> > Thanks John. I have no idea what the program does but it does seem
> > to catch a lot of the stuff I was going after.
> 
> Basically, given a word list and scores it generates re's to catch
> most simple obfuscations of those words. Theo is right, it largely
> overlaps the ReplaceTags plugin stuff, but I think there are a few
> obfuscations that it catches that ReplaceTags does not (after an
> admittedly brief look at ReplaceTags)...
> 
> > The re is huge so I can't easily figure out what it's doing, but
> > it does miss some of the spam I was targeting with my rule,
> > for example this one:
> > 
> > http://binaryops.com/spam3.txt
> 
> Yeah, at some point the obfuscation becomes problematic to detect
> with a low rate of false positives, and it is to some degree a game
> of whack-a-mole.
> 
> However, if the obfuscation becomes complex enough to be difficult to
> automatically detect, it becomes that much more difficult for the 
> victim to be able to *read* and make sense of, so the more esoteric 
> obfuscations become self-limiting.
> 
> > It was mail like that which forced me to use the .{0,4} clauses in
> > my rule. I'm probably causing some false positives though
> > especially since my scoring is really high.
> 
> Using .{0,4} is far too loose and will cause massive FPs. It's a 
> little better to try to match the specific extreme obfuscation 
> technique, in this case (?:\s[a-z]{2}\s)? (from your sample). Of 
> course, this will probably rot quickly.
> 
> Did you also create a rule for the "from $3, 33" parts? 
> --
>  John Hardin KA7OHZ
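
To make the difference between the two fillers discussed above concrete, here is a minimal Python sketch. The target word, the sample strings, and the helper name build() are all made up for illustration; they are not taken from the actual rule, from obfusc.pl, or from the spam sample, and the patterns are only an approximation of how such a rule might be put together.

```python
import re

# Hypothetical target word; the real rule is not reproduced here.
WORD = "viagra"

def build(word, filler):
    """Join the letters of `word`, allowing `filler` between each adjacent pair."""
    return re.compile(filler.join(re.escape(c) for c in word), re.IGNORECASE)

# Loose filler: up to four arbitrary characters between letters.
loose = build(WORD, r".{0,4}")
# Tighter filler: optionally one whitespace-delimited two-letter chunk.
tight = build(WORD, r"(?:\s[a-z]{2}\s)?")

spam = "v xq i zz a pk g ab r mn a"        # made-up obfuscated text
ham = "shipped via great circle route"     # made-up innocent text

for name, rx in (("loose", loose), ("tight", tight)):
    print(name, "spam:", bool(rx.search(spam)), "ham:", bool(rx.search(ham)))
```

Both patterns hit the made-up spam line, but only the loose one also fires on the innocent sentence (it spans "via grea"), which is the false-positive risk described above.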

Actually, the re in the rule was the only thing I could figure out that
matched all the spam that was getting through that day. I'm not sure
how common those kinds of mails are now, but I lowered the scoring a
lot in my rule, so hopefully it won't cause (m)any FPs. I didn't bother
with the "$3, 33" part, but you're right that making it part of the re
might be a good way to avoid trouble. Here's the work file I used while
making the re:

http://binaryops.com/spamwork.txt
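
As a rough sketch of the idea of tying the rule to the price fragment, the Python below only reports a hit when both a loose word pattern and something shaped like "from $3, 33" appear in the same body. The word "viagra", the spacing allowed in the price pattern, the sample bodies, and the function name hits() are assumptions for illustration only. In SpamAssassin itself the same effect would normally be had with a meta rule combining two body subrules.

```python
import re

# Loose pattern for a hypothetical target word (assumed, not the real rule).
WORD_RE = re.compile(r".{0,4}".join("viagra"), re.IGNORECASE)

# Price fragment shaped like "from $3, 33"; the flexible spacing is a guess.
PRICE_RE = re.compile(r"from\s*\$\s*\d+\s*,\s*\d{2}", re.IGNORECASE)

def hits(body):
    """Return True only if both the word and the price fragment are present."""
    return bool(WORD_RE.search(body)) and bool(PRICE_RE.search(body))

# Made-up message bodies.
spam_body = "v xq i zz a pk g ab r mn a from $3, 33"
ham_body = "shipped via great circle route"

print(hits(spam_body))
print(hits(ham_body))
```

Requiring the second signal lets the loose word pattern stay as it is while keeping it from firing on text like the second sample, which matches the word pattern by accident but has no price fragment.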



 