On Tue, 2014-07-08 at 22:41 -0400, David F. Skoll wrote:
> On Tue, 08 Jul 2014 21:03:35 -0400, Kevin A. McGrail wrote:
>
> > So this sounds like you are searching the entire email for this
> > string which just sounds inefficient especially if they use some big
> > attachments.
>
> It's not too bad because the regex is simple.

The regex is dead simple, indeed. Even as a full rule, it should perform
better than most complex body rules, since it is a very short pattern
with no backtracking.

To make discussion easier, the regex /\n\nTV[opqr]/ translates to "an
empty line, hopefully the gap at the beginning of a MIME attachment,
followed by certain base64 encoded bytes".

That last part is better described as a base64 encoded DOS MZ
executable -- or simply an .exe file. A base64 encoded /^TV[opqr]/ is
identical to /^MZ/ decoded.

Which of course means the rule applies to more than just a new variant
of some malware. Though generally targeting any MS executable sent as a
mail attachment is not a bad idea...

> The reason I did it with a SpamAssassin rule is that we have ways to
> push out SpamAssassin rules easily to our customers, but not so much
> code changes. :)
>
> The rule hits on surprisingly few messages (only two out of a couple of
> million so far), but it's not terribly accurate: One false-positive caused
> by a stupid base-64 encoder that leaves extra newlines between lines,

That should be solvable by adding a stricter boundary to the beginning
of the regex. Since I don't know what your samples use in particular,
here is a quick first idea.

  /(: base64|")\n\nTV[opqr]/

The idea is to anchor the regex at a char that is not part of the
base64 set, focusing on the Content-Transfer-Encoding and
Content-Disposition MIME headers, whichever comes last.

> and one sort-of-false-positive that was a DLL renamed to .DAT to sneak
> past filename extension blocks, but wasn't otherwise malicious.

If you deliberately try to sneak past sensible security measures, you
should not be surprised to be blocked. The attempt by an honest user to
disguise any $file (he did it on purpose, so he knows there are issues
with that) is in no way better than a dishonest user disguising a file.

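For anyone who wants to check the base64 claim and the stricter regex
without a mail corpus at hand, here is a rough Python sketch. It is not
SpamAssassin code. The MIME headers, the filename and the helper names
(b64_prefixes, mime_part) are made up for illustration, and it assumes
plain LF line endings, as the regex does.

```python
import base64
import re

# The two regexes from this message, as Python patterns.
LOOSE = re.compile(r"\n\nTV[opqr]")
STRICT = re.compile(r'(: base64|")\n\nTV[opqr]')


def b64_prefixes():
    """3-char base64 prefixes of b"MZ" followed by any possible third byte."""
    return {
        base64.b64encode(b"MZ" + bytes([b]))[:3].decode("ascii")
        for b in range(256)
    }


def mime_part(payload, sep="\n"):
    """Build a hypothetical MIME part; `sep` joins the 76-char base64 lines."""
    encoded = base64.b64encode(payload).decode("ascii")
    lines = [encoded[i:i + 76] for i in range(0, len(encoded), 76)]
    headers = (
        "Content-Type: application/octet-stream\n"
        'Content-Disposition: attachment; filename="sample.bin"\n'
        "Content-Transfer-Encoding: base64\n"
    )
    return headers + "\n" + sep.join(lines) + "\n"


# An attachment that really starts with the MZ magic bytes.
exe = mime_part(b"MZ\x90\x00" + bytes(60))

# Harmless payload whose second base64 line happens to start with "TVq",
# emitted by a sloppy encoder that puts an empty line between lines.
sloppy = mime_part(b"A" * 57 + b"MZ\x90\x00", sep="\n\n")

print(sorted(b64_prefixes()))
print("exe:   ", bool(LOOSE.search(exe)), bool(STRICT.search(exe)))
print("sloppy:", bool(LOOSE.search(sloppy)), bool(STRICT.search(sloppy)))
```

With these made-up samples the loose pattern hits both messages, while
the stricter one only hits the part whose base64 body directly follows
the headers. Real mail may order or fold the headers differently, so
treat this as a starting point rather than a tested rule.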
-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}