> -----Original Message-----
> From: Robert Menschel [mailto:[EMAIL PROTECTED]
> Sent: Monday, June 30, 2003 9:43 PM
> To: Ralf Guenthner
> Cc: [EMAIL PROTECTED]
> Subject: Re: [SAtalk] Creative spam, any ideas?
> 
> 
> Hello Ralf,
> 
> Monday, June 30, 2003, 1:37:22 AM, you wrote:
> 
> RG> The spam below slipped through SA 2.54. Note how they substitute
> RG> possible trigger terms with other characters, like a capital "I". Any
> RG> ideas what to do to catch stuff like this? The mail also contained a
> RG> rather graphic image...
> 
> 1) This is where Bayes excels. Feed them to Bayes as confirmed spam; the
> tokens will add up quite quickly.
> 
> 2) I've been collecting these into rules which identify the use of masked
> words, eg:
> body     L_b_MaskedW0rds   /L0SE|[EMAIL PROTECTED]|si0n|casin0|0nline|m0re|[EMAIL PROTECTED]|F0r|d0|[EMAIL PROTECTED]|Ple[EMAIL PROTECTED]|m0ve|ph[EMAIL PROTECTED]|[EMAIL PROTECTED]|[EMAIL PROTECTED]/i
> describe L_b_MaskedW0rds   masked spam word(s)
> score    L_b_MaskedW0rds   0.1
> body     L_b_MaskedW0rds2  /WeIcome|Mldget|AnimaI|sieix|E\}\{treme|FlSTlng|Tatt00ed|Iadies|MasslVE|Ioads|BlZarre|hardc0re|0bscene|AmaZlNG|SENsatl0NAL|SlCkenlNG/i
> body     L_b_MaskedW0rds3  /\bl[i1]v[e3] .{0,9}(?:fuck(?:[i1]ng)?|s[e3]x|nak[e3]d|g[i1]rls?|v[i1]rg[i1]ns?|t[e3][e3]ns?|p[0o]rn[0o]?)\b/i
> body     L_b_MaskedW0rds4  /\b(excIusive|GiangBiang|sIut|ganigbainged|duides|hairdciore|ExcIude|pIz)/i
> 
> I have another set which looks for similar items in subject headers. I've
> just begun using these, so my scores are set very low right now, while I
> check for errors in the rules, false positives, etc. Once I'm comfortable
> and confident with them, the scores will be raised to much higher levels.
> 
> On a related topic, which I think I've seen asked before but don't
> remember seeing an answer, which is better (more efficient) within SA:
> single rules with many alternatives such as I have above, or many rules
> with few alternatives? Does one form use less computer resources than
> another?
> 
> Bob Menschel


These are good to try to catch, but I tend to leave them for last. There are
so many ways to obfuscate a word that it gets silly. So I only go after a few
of the most popular words, and write a rule to try to catch any obfuscated
instance of each.

Also, I've learned that packing too many things into one rule will drive you
nuts, so it is better to keep one rule per word. Something like MY_OBFU_FREE
just looks for the word "Free" obfuscated, and so on.

Here is a list off the top of my head of the ones I look for:
Free
casino
ejaculate
sex
intercourse
penis
adult
girls
sluts
hardcore
movies

Maybe a few more I can't remember. But each has its own rule, covering just
about every possibility of OBFU I could come up with. Chances are that if they
have other words OBFU'd, they have these too. So I hit most of them with just
these famous few.
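
To illustrate the one-rule-per-word idea, here is a minimal Python sketch
that builds an obfuscation regex for a single plain word and prints it as
rule lines. It is not the rule set described above: the look-alike table,
the filler-character allowance, and the helper names obfu_pattern and
sa_rule are all assumptions made up for this example. Only the
MY_OBFU_<WORD> naming and the low 0.1 starting score come from this thread.

```python
import re

# Hypothetical look-alike table: characters a spammer might swap in for each
# letter. Illustrative only; extend it with whatever shows up in your spam.
LOOKALIKES = {
    "a": "a4",
    "e": "e3",
    "i": "i1l",
    "l": "l1i",
    "o": "o0",
    "s": "s5",
}

# Up to two filler characters (punctuation, underscore) between letters,
# as in "f.r.e.e" or "f_r_e_e". Whitespace is excluded to limit false hits.
GAP = r"[^a-z0-9\s]{0,2}"


def obfu_pattern(word):
    """Build a regex source string for disguised spellings of a plain word.

    Assumes `word` contains only ASCII letters.
    """
    word = word.lower()
    parts = []
    for ch in word:
        chars = LOOKALIKES.get(ch, ch)
        parts.append("[" + chars + "]" if len(chars) > 1 else ch)
    # The negative lookahead rejects the plain spelling, so only disguised
    # variants match.
    return r"\b(?!" + word + r"\b)" + GAP.join(parts) + r"\b"


def sa_rule(word, score=0.1):
    """Return body/describe/score lines for one word, one rule per word."""
    name = "MY_OBFU_" + word.upper()
    return [
        "body     %s  /%s/i" % (name, obfu_pattern(word)),
        "describe %s  obfuscated spelling of %s" % (name, word),
        "score    %s  %s" % (name, score),
    ]


if __name__ == "__main__":
    for w in ["free", "casino", "adult"]:
        print("\n".join(sa_rule(w)))
```

The table is deliberately limited to letters and digits so the generated
pattern needs no extra escaping. Because the rule is case-insensitive, the
"i" entries also cover the capital-"I"-for-"l" trick. Anything generated
this way should start at a low score and be checked against real mail for
false positives before the score goes up.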

HTH
Chris Santerre
System Admin
"You should never, never doubt what nobody is sure about."- Willy Wonka