I have been testing the HTML obfuscation rule, with the length of the junk
matched inside the tags ranging from 1 to 5.

  full  MY_FULL_OBFU_HTML  /[\s>]\w+<[\w\s\/\$&;]{1,6}>\w+/

These are the results of my testing:

  {1} have not noticed false positives
  {2} false positives with <br>
  {3} false positives with <sup></sup>
  {4} false positives with <font></font>
  {5} have not noticed false positives
  {6} false positives with <center></center>
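
For anyone who wants to reproduce the per-length checks outside SpamAssassin,
here is a minimal Python sketch. It assumes Python's re module treats this
pattern the same way Perl does, which should hold for the constructs used here.
The function name junk_lengths and the sample strings are made up for
illustration and are not taken from real mail.

```python
import re

# Character class copied from the rule above.
JUNK = r"[\w\s\/\$&;]"

def junk_lengths(text, max_len=6):
    """Return each junk length n (1..max_len) whose exact-length pattern matches text."""
    hits = []
    for n in range(1, max_len + 1):
        # Same rule, but with {n} in place of the {1,6} range.
        pattern = re.compile(r"[\s>]\w+<" + JUNK + "{%d}" % n + r">\w+")
        if pattern.search(text):
            hits.append(n)
    return hits

# Hypothetical samples: one obfuscated word and three legitimate tags.
samples = {
    "obfuscated": " Vi<q>agra",
    "line break": " first<br>second",
    "superscript": " 10<sup>th</sup>",
    "centered": " see<center>here</center>",
}

for name, text in samples.items():
    print(name, junk_lengths(text))
```

Each legitimate tag trips only the length that equals the number of characters
in its name, which fits the pattern of false positives listed above.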

This is not to say that {1} and {5} never produce false positives, only
that I have not noticed any.

The percentage of false positives has not been high, and they show up
consistently on certain messages.  For example, a Travelocity notification
will always trigger on {3}.  The worst of all the above is {2}.  Whitelists
can help avoid most of the false positives.

To try to curb the FPs for tests within the {1,5} range, I will experiment
with the following rule:

  full  MY_FULL_OBFU_HTML  /([\s>]\w+<[\w\s\/\$&;]{1,6}>\w+){2,}/

Please let me know if there is a better way to write this rule.
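
To see what the {2,} form actually requires, here is a minimal Python sketch
comparing the two rules. It assumes Python's re module handles these patterns
the same way Perl does; the names SINGLE, REPEATED and compare, and the sample
strings, are hypothetical.

```python
import re

# The original rule and the proposed repeated-group rule.
SINGLE = re.compile(r"[\s>]\w+<[\w\s\/\$&;]{1,6}>\w+")
REPEATED = re.compile(r"([\s>]\w+<[\w\s\/\$&;]{1,6}>\w+){2,}")

def compare(text):
    """Return (single_hit, repeated_hit) for one piece of text."""
    return (bool(SINGLE.search(text)), bool(REPEATED.search(text)))

# Hypothetical samples, not taken from real mail.
samples = [
    " first<br>second",           # one legitimate tag
    " Vi<q>agra ch<x>eap",        # two obfuscated words back to back
    " Vi<q>agra is ch<x>eap",     # two obfuscated words with a plain word between
    " one<br>two three<br>four",  # two legitimate tags in adjacent words
]

for text in samples:
    print(repr(text), compare(text))
```

Because the group repeats with nothing allowed in between, the second
occurrence has to start at the whitespace or ">" immediately after the first.
In this sketch a single <br> no longer fires, but two obfuscated words
separated by a plain word are missed, and two <br> tags in adjacent words
still match.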


Spamassassin-talk mailing list