On Thu, 26 Mar 2009, haman...@t-online.de wrote:

John Hardin wrote:

If 3.2.x does indeed implement multiline rawbody matches, then we'll be able to have a robust rule for this - e.g. an HTML email with a table that has more than 30 columns and more than 5 rows. That will be difficult to obfuscate.
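
To make the idea concrete, here is a rough sketch of that table-shape test, written in Python and run outside SpamAssassin. It is not an actual SpamAssassin rule. The names `TableShapeCounter` and `looks_like_grid_art` are made up for illustration, and the thresholds simply mirror the "more than 30 columns and more than 5 rows" figure above.

```python
from html.parser import HTMLParser


class TableShapeCounter(HTMLParser):
    """Record (row count, widest row) for every table in an HTML body."""

    def __init__(self):
        super().__init__()
        self.tables = []  # finished tables as (rows, max_cells_in_a_row)
        self._stack = []  # per open table: [rows, max_cells, cells_in_current_row]

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._stack.append([0, 0, 0])
        elif self._stack and tag == "tr":
            self._close_row()
            self._stack[-1][0] += 1
        elif self._stack and tag in ("td", "th"):
            # Assumption: colspan is ignored, each cell counts as one column.
            self._stack[-1][2] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._stack:
            self._close_row()
            rows, max_cells, _ = self._stack.pop()
            self.tables.append((rows, max_cells))

    def _close_row(self):
        table = self._stack[-1]
        table[1] = max(table[1], table[2])
        table[2] = 0

    def finish(self):
        # Spam HTML is often malformed, so flush tables that were never closed.
        self.close()
        while self._stack:
            self.handle_endtag("table")
        return self.tables


def looks_like_grid_art(html, min_cols=30, min_rows=5):
    """True if any table has more than min_cols columns and more than min_rows rows."""
    counter = TableShapeCounter()
    counter.feed(html)
    return any(rows > min_rows and cols > min_cols
               for rows, cols in counter.finish())
```

Nested tables are counted separately, so a layout table wrapping a letter grid would not hide the grid.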

Hi John,
by the time the detection is ready, you will get the entire message as ASCII art inside a <pre> or individual letters as ASCII art, making up a table with one cell for each letter,

That should be good bayes fodder.

or the same pattern made up of <IMG src=red.gif> <IMG src=white.gif> without a table

That too is unusual enough to be a good spam sign. There are also existing rules for high image-to-text ratios.
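
As a rough illustration of why a grid of tiny images stands out, the sketch below counts IMG tags against visible text characters. It is not how the existing SpamAssassin rules are implemented. The names `ImageTextCounter` and `image_heavy` and both thresholds are hypothetical, chosen only for the example.

```python
from html.parser import HTMLParser


class ImageTextCounter(HTMLParser):
    """Count <img> tags and non-whitespace text characters in an HTML body."""

    def __init__(self):
        super().__init__()
        self.images = 0
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images += 1

    def handle_data(self, data):
        # Simplification: text inside <script> or <style> is counted too.
        self.text_chars += len("".join(data.split()))


def image_heavy(html, min_images=50, max_chars_per_image=2.0):
    """True if the body has many images and very little text per image."""
    counter = ImageTextCounter()
    counter.feed(html)
    counter.close()
    if counter.images < min_images:
        return False
    return counter.text_chars / counter.images <= max_chars_per_image
```

A message drawn from red.gif and white.gif pixels would need dozens of images and carry almost no text, which is the combination this check looks for.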

In the long run we will render html to an image and then OCR it to detect the message :)

Heh. Yeah. That's been proposed before, too.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  A well educated Electorate, being necessary to the liberty of a
  free State, the Right of the People to Keep and Read Books,
  shall not be infringed.
-----------------------------------------------------------------------
 64 days since Obama's inauguration and still no unicorn!
