On Tue, 4 Nov 2003, Colin A. Bartlett wrote:
> I have a rule challenge for you all.
> How can we write a rule to catch messages like the one attached? 

# rawbody rather than body: body rules are matched after HTML tags are stripped
rawbody LOC_BRMASK /<br>(&nbsp;?|.){1,5}<br>(&nbsp;?|.){1,5}<br>(&nbsp;?|.){1,5}<br>/i
describe LOC_BRMASK Masking BR tags with 1-5 characters between
score LOC_BRMASK (see below)
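
For anyone who wants to sanity-check the pattern before loading it, here is
a minimal sketch in Python. The `re` module stands in for Perl's regex
engine, and both sample strings are made up for illustration (the attachment
from the original message is not reproduced here).

```python
import re

# Same pattern as LOC_BRMASK, written for Python's re module.
BRMASK = re.compile(
    r"<br>(&nbsp;?|.){1,5}<br>(&nbsp;?|.){1,5}<br>(&nbsp;?|.){1,5}<br>",
    re.IGNORECASE,
)

# Hypothetical samples, not taken from the original spam.
masked = "<td>Lim<br>Off<br>&nbsp;&nbsp;&nbsp;<br>end<br></td>"
normal = "<p>First line of ordinary text.<br>Second line of ordinary text.<br></p>"

for label, html in (("masked", masked), ("normal", normal)):
    # Report whether each sample would trigger the rule.
    print(label, BRMASK.search(html) is not None)
```

The masked sample has four <br> tags with no more than five characters or
&nbsp; entities between neighbours, so it hits; the ordinary paragraph does
not.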

Initially, I'm scoring this one low, just to see what it hits, but I
suspect that to be effective, this one test will have to score very *high*
to compensate for the way this table/br technique successfully hides
blacklist phrases/words.

The example provided had 3 letters/nbsps in each 'cell'; I've set the rule
to look for between 1 and 5 characters. I doubt we'll see more than 5, as
longer runs would leave blacklist words intact enough for existing rules to
catch. I'm sure we will soon have to introduce a more complex rule that
removes obfuscating HTML (<i></i> pairs and other tricks) and still detects
the 1-5 character situation.
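
To illustrate the strip-then-test idea, here is a rough Python sketch. The
list of inline tags and the helper name `hits_brmask` are my own
assumptions, not anything SpamAssassin provides; a real rule would need a
more thorough list.

```python
import re

# Assumed set of inline tags a spammer might wrap around single letters.
INLINE_TAGS = re.compile(
    r"</?(?:i|b|u|em|strong|font|span)\b[^>]*>", re.IGNORECASE
)

# Same 1-5 character pattern as LOC_BRMASK.
BRMASK = re.compile(
    r"<br>(&nbsp;?|.){1,5}<br>(&nbsp;?|.){1,5}<br>(&nbsp;?|.){1,5}<br>",
    re.IGNORECASE,
)

def hits_brmask(html: str) -> bool:
    """Drop inline formatting tags, then apply the 1-5 character test."""
    stripped = INLINE_TAGS.sub("", html)
    return BRMASK.search(stripped) is not None

# Hypothetical sample: formatting tags pad each cell past five characters.
sample = "<br><i>a</i>b<br><b>c</b><br>d<font color=red>e</font><br>"
print(BRMASK.search(sample) is not None)  # raw pattern alone
print(hits_brmask(sample))                # after stripping inline tags
```

The `\b` after the tag name keeps `<br>` itself from being stripped along
with `<b>`.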

My only other suggestion would be for the SpamAssassin developers to create
a chunk of code that scans HTML for tables and has the smarts to
re-assemble a broken table of text into 'lines' before testing for
buzzwords.
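
A rough Python sketch of what such re-assembly might look like. It assumes
the layout is one table row whose cells each hold <br>-separated fragments,
so that fragment N of every cell lines up into rendered line N. That layout,
the function name `reassemble_lines`, and the sample are my assumptions; the
regex-based parsing is only good enough for illustration and is not how
SpamAssassin parses HTML.

```python
import re
from html import unescape
from itertools import zip_longest

ROW = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL = re.compile(r"<td\b[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")

def reassemble_lines(html: str) -> list[str]:
    """Join the Nth <br>-separated fragment of every cell in a row."""
    lines = []
    for row in ROW.findall(html):
        columns = []
        for cell in CELL.findall(row):
            # One fragment per <br>-separated piece; drop leftover tags
            # and turn &nbsp; into a plain space.
            pieces = [
                unescape(TAG.sub("", piece)).replace("\xa0", " ")
                for piece in BR.split(cell)
            ]
            columns.append(pieces)
        # Read across the cells, one rendered line at a time.
        for fragments in zip_longest(*columns, fillvalue=""):
            line = "".join(fragments).strip()
            if line:
                lines.append(line)
    return lines

# Hypothetical sample with three characters per cell fragment.
sample = (
    "<table><tr>"
    "<td>Lim<br>Off</td>"
    "<td>ite<br>er!</td>"
    "<td>d&nbsp;<br>&nbsp;&nbsp;</td>"
    "</tr></table>"
)
print(reassemble_lines(sample))
```

The re-assembled lines could then be fed to the ordinary buzzword tests.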

- Charles







_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
