On Thu, 26 Mar 2009, haman...@t-online.de wrote:
John Hardin wrote:
If 3.2.x does indeed implement multiline rawbody matches, then we'll be
able to have a robust rule for this - e.g. an HTML email with a table
that has more than 30 columns and more than 5 rows. That will be
difficult to obfuscate.
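To make the heuristic above concrete, here is a rough sketch of the table-shape check in standalone Python. It is not a SpamAssassin rule (those are regex-based), and the names `TableShapeParser` and `looks_like_table_art` are made up for illustration. It assumes the thresholds mentioned above (more than 30 columns, more than 5 rows) and ignores `colspan`, so treat it as an illustration of the idea only.

```python
from html.parser import HTMLParser


class TableShapeParser(HTMLParser):
    """Record (rows, max_columns) for every <table> seen in an HTML body."""

    def __init__(self):
        super().__init__()
        self.open_tables = []  # one [rows, cells_in_current_row, max_cols] per open table
        self.shapes = []       # (rows, max_cols) for each table that was closed

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already lowercased.
        if tag == "table":
            self.open_tables.append([0, 0, 0])
        elif self.open_tables and tag == "tr":
            table = self.open_tables[-1]
            table[0] += 1   # one more row
            table[1] = 0    # start counting cells for this row
        elif self.open_tables and tag in ("td", "th"):
            table = self.open_tables[-1]
            table[1] += 1
            table[2] = max(table[2], table[1])

    def handle_endtag(self, tag):
        if tag == "table" and self.open_tables:
            rows, _, max_cols = self.open_tables.pop()
            self.shapes.append((rows, max_cols))


def looks_like_table_art(html, col_threshold=30, row_threshold=5):
    """True if any table has more than col_threshold columns
    and more than row_threshold rows (assumed thresholds)."""
    parser = TableShapeParser()
    parser.feed(html)
    parser.close()
    # Spam HTML is often malformed, so count tables that were never closed too.
    shapes = parser.shapes + [(r, m) for r, _, m in parser.open_tables]
    return any(rows > row_threshold and cols > col_threshold
               for rows, cols in shapes)
```

Calling `looks_like_table_art(body)` on the decoded HTML part of a message returns a boolean; a real rule would presumably feed a score rather than a hard verdict.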
Hi John,
by the time the detection is ready, you will get the entire message as ASCII
art inside a <pre>, or individual letters as ASCII art, making up a table with
one cell for each letter,
That should be good Bayes fodder.
or the same pattern made up of <IMG src=red.gif> <IMG src=white.gif> without
a table
That too is unusual enough to be a good spam sign. There are also existing
rules for high image-to-text ratios.
In the long run we will render the HTML to an image and then OCR it to
detect the message :)
Heh. Yeah. That's been proposed before, too.
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
A well educated Electorate, being necessary to the liberty of a
free State, the Right of the People to Keep and Read Books,
shall not be infringed.
-----------------------------------------------------------------------
64 days since Obama's inauguration and still no unicorn!