Hi Keith,

Thanks for the reply!

> -----Original Message-----
> From: Keith C. Ivey
> Sent: Monday, October 13, 2003 11:31 PM
> To: [EMAIL PROTECTED]
> Subject: RE: [SAtalk] More HTML Obfuscation: This One Made It Through
> 
> 
> Larry Gilson <[EMAIL PROTECTED]> wrote:
> 
> > ### I wrapped the rawbody line to keep the integrity of the rule.
> > # Invisible text color in font tag
> > rawbody  MY_RBDY_INVSTXT    
> >    /<font.* color=("?\#?FFFFF[0-9A-F]"?|"?white"?).*>/i
> > describe MY_RBDY_INVSTXT    MY: Invisible text color
> > score    MY_RBDY_INVSTXT    2.0
> 
> That should work.  Ultimately it would be better taken care of 
> with the other FONT rules.  It should really be caught by 
> HTML_FONT_INVISIBLE.

It would be nice if HTML_FONT_INVISIBLE worked better.  My rule catches many
more occurrences than HTML_FONT_INVISIBLE does, at least in 2.55.


> > # Obfuscate text by using ISO 8859-1 character set DEC encoding 
> > rawbody  MY_RBDY_OBFU_ISOD
> >    /&\#(6[5-9]|[7-9][0-9]|1[0-1][0-9]|12[0-6])\D/
> > describe MY_RBDY_OBFU_ISOD  MY: OBFU text with ISO DEC set
> > score    MY_RBDY_OBFU_ISOD  4.0
> > 
> > If you ever get HEX encoding, you can use:
> > # Obfuscate text by using ISO 8859-1 character set HEX encoding 
> > rawbody  MY_RBDY_OBFU_ISOH
> >    /\%(4[1-9]|[5-7][0-9]|[4-6][A-F]|7[A-E])\D/i
> > describe MY_RBDY_OBFU_ISOH  MY: OBFU text with ISO HEX set
> > score    MY_RBDY_OBFU_ISOH  4.0
> 
> You're confusing two things in those two rules.  The difference 
> isn't decimal versus hexadecimal -- it's that the first is HTML 
> escaping and the second is URL escaping.

Thank you for setting me straight here.  I really did not know the
difference, obviously.  I was responding to messages that use encoding via
either escape method.
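
To make the distinction concrete, here is a small Python sketch (not
SpamAssassin code; the sample strings are made up for illustration).  It
decodes the same word written with HTML escaping and with URL escaping,
using only the standard library, and shows that each decoder leaves the
other form untouched:

```python
import html
from urllib.parse import unquote

# Hypothetical samples: both spell "Hello".
html_escaped = "&#72;&#101;&#108;&#108;&#111;"  # HTML numeric character references
url_escaped = "%48%65%6C%6C%6F"                  # URL (percent) escapes

# Each decoder handles its own escape style ...
print(html.unescape(html_escaped))  # decodes the HTML references
print(unquote(url_escaped))         # decodes the percent escapes

# ... and ignores the other one.
print(html.unescape(url_escaped))   # unchanged: no '&#' references to decode
print(unquote(html_escaped))        # unchanged: no '%' escapes to decode
```

An HTML renderer applies the first kind of decoding to text and attribute
values; the second kind only has meaning inside a URL.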


> The second one should mostly already be caught by 
> HTTP_EXCESSIVE_ESCAPES, since URL escaping really only works in 
> URLs.  If you have examples where it's not being caught, I'd 
> like to see them.

I have had cases where my MY_RBDY_OBFU_ISOH rule hit but
HTTP_EXCESSIVE_ESCAPES did not.  I believe that HTTP_EXCESSIVE_ESCAPES only
looks for URL escaping, and only for a subset of the characters that I check
for.  I have also seen URLs that combine URL and HTML escaping, which
HTTP_EXCESSIVE_ESCAPES does not pick up.


> It is possible to use hexadecimal numbers in HTML escaping, 
> though, and you're not catching that.  For example, 'A' instead 
> of being '&#65;' can be represented as '&#x41;'.  You could 
> combine parts of your two regexes to match those.  Also, you 
> can have leading 0's in the numbers, so '&#65;' can be written 
> as '&#065;' (or '&#0065;' or maybe '&#000000000000000000065;').

Great to know!  I have been Googling to find information so I can better
educate myself.  Your input definitely points me in the right direction.
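
As a rough illustration of Keith's suggestion, the sketch below combines the
number ranges from the two rules above into one pattern that accepts decimal
or hexadecimal HTML escapes with any number of leading zeros.  It is written
in Python rather than as a rawbody rule, the names NUMERIC_REF and
count_numeric_refs are made up for this example, and it has not been tested
inside SpamAssassin:

```python
import re

# Matches HTML numeric character references for codes 65-126 (0x41-0x7E),
# the same ranges the two rules above check for.
NUMERIC_REF = re.compile(
    r"&#(?:"
    # decimal form, optional leading zeros, no further digit afterwards
    r"0*(?:6[5-9]|[7-9][0-9]|1[01][0-9]|12[0-6])(?!\d)"
    # hexadecimal form: 'x', optional leading zeros, no further hex digit
    r"|x0*(?:4[1-9]|[5-7][0-9]|[4-6][A-F]|7[A-E])(?![0-9A-F])"
    r")",
    re.IGNORECASE,
)


def count_numeric_refs(text: str) -> int:
    """Count escaped characters in the 65-126 range within text."""
    return len(NUMERIC_REF.findall(text))


# All of Keith's spellings of 'A' should be counted.
for sample in ["&#65;", "&#065;", "&#000000000000000000065;", "&#x41;"]:
    print(sample, count_numeric_refs(sample))
```

The lookaheads stand in for the trailing \D of the original rules, so a
longer number such as '&#1234;' is not mistaken for '&#123' followed by
other text.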

Keith, thanks again for your time!

--Larry



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
