On Wed, 2014-07-23 at 11:45 -0600, Amir 'CG' Caspi wrote:
> I'm definitely considering writing a rule to catch &#[0-9]{3};
> patterns. I'm definitely worried it could cause FPs, but are there
> common circumstances where legitimate emails would include dozens to
> hundreds of these? (The latest FNs only include a few dozen, not the
> hundreds seen in the spample above.)
>
This works for me:

describe MG_HEX_HTML  Body contains too many HTML hex encodings
body     MG_HEX_HTML  /(.{0,3}\&\#x[0-9A-F]{4};){5}/
score    MG_HEX_HTML  3.5

It is also used in a meta, along with some other simple local rules,
to give hex-bearing spam an extra kick up the rear. I found that, in
my mailstream anyway, there was generally not much else to write rules
against, hence the high score.

Spam arriving here gets quarantined: I look at the sender and subject
as a matter of course and, if it looks like a possible FP, I'll look at
the text too (I wrote a PHP viewer for quarantined spam a long time
ago) but it appears that, after the brief squall of hex spam which made
me write the rule, the promised spamstorm ended and so far has failed
to restart.

Martin
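
For anyone wanting to sanity-check the pattern against their own samples before scoring it this high, here is a minimal sketch in Python. It only exercises the regex itself, not SpamAssassin's body rendering, and the function name and the two sample strings are made up for illustration rather than taken from real mail.

```python
import re

# Same pattern as the MG_HEX_HTML body rule. The backslashes before
# & and # are dropped because Python's re does not need them.
# Like the original rule, this is case-sensitive: only upper-case
# hex digits (A-F) match.
HEX_RUN = re.compile(r"(.{0,3}&#x[0-9A-F]{4};){5}")


def has_hex_run(text: str) -> bool:
    """True if text has five 4-digit hex entities in a row, each
    preceded by at most three other characters."""
    return HEX_RUN.search(text) is not None


# Hypothetical samples (assumptions, not real messages).
# "Hello" with every character written as a 4-digit hex entity.
spammy = "".join("&#x%04X;" % ord(c) for c in "Hello")
# Ordinary text with two isolated currency entities.
legit = "Price: &#x20AC;10 or &#x00A3;8"

print(has_hex_run(spammy))  # True
print(has_hex_run(legit))   # False
```

The `{5}` repeat is what keeps a message with only a couple of isolated entities from hitting, and `.{0,3}` allows a few ordinary characters between one entity and the next.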