On Wed, 2014-07-23 at 11:45 -0600, Amir 'CG' Caspi wrote:

> I'm definitely considering writing a rule to catch &#x0[0-9]{3}; 
> patterns.  I'm definitely worried it could cause FPs, but are there 
> common circumstances where legitimate emails would include dozens to 
> hundreds of these?  (The latest FNs only include a few dozen, not the 
> hundreds seen in the spample above.)
> 
This works for me:

describe MG_HEX_HTML  Body contains too many HTML hex encodings
body     MG_HEX_HTML  /(.{0,3}\&\#x[0-9A-F]{4};){5}/
score    MG_HEX_HTML  3.5
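
For anyone wanting to check what the pattern does and does not hit before deploying it, here is a minimal sketch that ports the regex to Python's re module. It only tests the pattern against strings; it says nothing about how SpamAssassin presents the message body to a body rule. The function name has_hex_run and the sample strings are made up for illustration and are not taken from real mail.

```python
import re

# Python port of the MG_HEX_HTML pattern. The Perl escapes \& and \#
# are not needed in Python, so they are dropped here.
HEX_RUN = re.compile(r"(.{0,3}&#x[0-9A-F]{4};){5}")


def has_hex_run(text: str) -> bool:
    """Return True if text contains five 4-digit hex entities in a row,
    each preceded by at most three other characters."""
    return HEX_RUN.search(text) is not None


# Hypothetical samples, invented for this sketch.
spam_like = "&#x0041;&#x0042; &#x0043;ab&#x0044;&#x0045;"  # five entities, short gaps
few_entities = "Caf&#x00E9; and na&#x00EF;ve"              # two entities, far apart
lowercase_hex = "&#x00e9;" * 5                             # lowercase hex digits

print(has_hex_run(spam_like))
print(has_hex_run(few_entities))
print(has_hex_run(lowercase_hex))
```

The first sample matches and the other two do not. The last case is worth noting: as written, the character class is uppercase only, so entities with lowercase hex digits would slip past unless the rule were given the /i modifier.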

It is also used in a meta, along with some other simple local rules, to
give hex-bearing spam an extra kick up the rear. In my mailstream, at
least, there was generally not much else to write rules against, hence
the high score.

Spam arriving here gets quarantined. I look at the sender and subject as
a matter of course and, if a message looks like a possible FP, I'll look
at the text too (I wrote a PHP viewer for quarantined spam a long time
ago). However, after the brief squall of hex spam that made me write the
rule, the promised spamstorm ended and has so far failed to restart.
  

Martin



