On Fri, 2013-10-25 at 19:12 -0400, Alex wrote:
> I've created a bunch of rules that are intended to detect short body's
> meta'd with a missing subject. I thought it was working okay, but I
> think I should have an exclusion for messages that contain a
> significant attachment.

Assuming a loose interpretation of "significant" attachment as any
image, these should help. It is easy to include more (specific) content
types. See the MIMEHeader plugin.

  mimeheader __MIME_IMAGE   Content-Type =~ /^image\/./
  mimeheader __MIME_ATTACH  Content-Disposition =~ /^attachment/

If by significant you mean the size (dimensions) of an image (as in no
tiny stupid logos or smiling yellow blobs), the ImageInfo plugin is what
you want. Its documentation is in the pm file; there is no man page.

> I'd appreciate it if someone could help me review my rules and show me
> where they're going wrong. Some of it is adapted from John's work back
> in April, I think.
>
> rawbody __RB_LE_200 /^.{2,200}$/s
> tflags __RB_LE_200 multiple maxhits=2

I understand this stuff, weird at first sight, is designed to match a
(raw)body with <= 200 chars, and to prevent FPing on a body that just
slightly exceeds the chunk size, no?

> body __RB_GT_200 /^.{201}/s
> meta __BODY_LE_200 (__RB_LE_200 == 1) && !__RB_GT_200

However, since the chunk size is 1-2 kB, __RB_LE_200 cannot match more
than once. Even worse, it may match the last chunk of a body with a
total size of more than 200 bytes. The last constraint in the meta
prevents this FP, not the 'equals 1' test.

The sub-rule __RB_GT_200 appears to be intended as a rawbody rule, not
body.

Either way, the entirety of these rules is much too complicated.

A test for "more than" is easy and cheap. Generally as shown above. An
accompanying test for "less than or equal" to the same amount is simply
its negation.

  meta __RB_LE_200  !__RB_GT_200  # less or equal IFF not greater

> meta LOC_SHORT (__BODY_LE_200 && __HAS_HTTP_URI && (!(BAYES_00 ||
> USER_IN_WHITELIST || KHOP_RCVD_TRUST)))
> describe LOC_SHORT Has URI and short body
> score LOC_SHORT 1.1

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
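
For reference, the pieces discussed in this reply could be combined roughly as follows. This is an untested sketch, not a drop-in rule set. It assumes that __HAS_HTTP_URI is defined elsewhere in the original poster's configuration, and that "significant attachment" means any image part or any part with an attachment disposition. Folding the two mimeheader sub-rules into LOC_SHORT as exclusions is an assumption about the intended use, not something stated in the thread.

```
# Attachment helpers (require the MIMEHeader plugin to be loaded)
mimeheader __MIME_IMAGE   Content-Type =~ /^image\/./
mimeheader __MIME_ATTACH  Content-Disposition =~ /^attachment/

# "More than 200 chars" is the cheap test, written as rawbody
rawbody    __RB_GT_200    /^.{201}/s

# "Less than or equal" is simply its negation
meta       __RB_LE_200    !__RB_GT_200

# Assumption: exclude messages carrying an image or an attachment
meta       LOC_SHORT      (__RB_LE_200 && __HAS_HTTP_URI && !(__MIME_IMAGE || __MIME_ATTACH) && !(BAYES_00 || USER_IN_WHITELIST || KHOP_RCVD_TRUST))
describe   LOC_SHORT      Has URI and short body
score      LOC_SHORT      1.1
```

Running spamassassin --lint after adding rules like these should catch syntax errors before they reach production.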