Vivek Khera <[EMAIL PROTECTED]> writes:

> I've had exactly 1 message in three years for which this missing
> boundary was not a SPAM (which seems to be some mailing list software
> that did that botching on an attachment).  There doesn't seem to be a
> correlation with X-Mailer header, either.
>
> Does this sound like a good test to add?  I'm not sure how one would
> add it without modifying the MIME parser to detect it.

Sounds like a good test.  I also get that error in VM, so I'd be happy
if they were filtered out.

I was already playing with another "MIME quality" test, so I tried your
suggestion too.

Test set: 1789 total messages, 465 of those are spam.

  test                   description
  MIME_SUSPECT_NAME      MIME filename does not match MIME content type
  MIME_MISSING_BOUNDARY  missing final boundary

  test                   matches  spam  not-spam
  MIME_SUSPECT_NAME            8     7         1  (reply to a virus email)
  MIME_MISSING_BOUNDARY       36    36         0

That's pretty good.  Interestingly, almost all of the matched messages
also match several other rules:

  NO_REAL_NAME:     35 of 36
  MIME_ODD_CASE:    34 of 36
  BASE64_ENC_TEXT:  34 of 36

The NO_REAL_NAME is not too meaningful since it matches 423 messages
with only 244 being spam, but MIME_ODD_CASE matches 65 messages with 65
spam, so it almost feels like a small number of spam software packages
generating these particular spam messages.

I think the false positive for MIME_SUSPECT_NAME would be eliminated
with better MIME parsing code than my own.

Incidentally, "rawbody" seems to be misnamed since you don't seem to get
the raw body (meaning, 100% uncooked).  I had to use get_body() to get
the original unmodified body.

Another idea: what about a negative score for emails containing RFC 934
encapsulated messages?

  $ egrep -hi '^--* end.* -*-$' *[0-9] | count
     7 ------- End of forwarded message -------
     8 ------- end -------

  $ egrep -hi '^--* start.* -*-$' *[0-9] | count
     7 ------- Start of forwarded message -------
     8 ------- start of forwarded message (RFC 934 encapsulation) -------

Maybe not very common these days, but could be useful for digested
mailing lists.

Dan

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
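
For anyone who wants to try the two ideas from this message outside of SpamAssassin, here is a rough sketch in Python. It is not the code used for the numbers above, and it is not SpamAssassin's API: the function names (boundary_param, missing_final_boundary, has_rfc934_encapsulation) are made up for illustration. It assumes you already have the Content-Type header value and the unmodified body as strings, and it only looks at the top-level multipart boundary. The two regular expressions are the egrep patterns from the message, applied line by line.

```python
import re

# Patterns taken from the egrep commands in the message (case-insensitive).
RFC934_START = re.compile(r'^--* start.* -*-$', re.IGNORECASE)
RFC934_END = re.compile(r'^--* end.* -*-$', re.IGNORECASE)


def boundary_param(content_type):
    """Return the boundary parameter of a Content-Type value, or None."""
    m = re.search(r'boundary\s*=\s*(?:"([^"]+)"|([^\s;]+))',
                  content_type, re.IGNORECASE)
    if not m:
        return None
    return m.group(1) or m.group(2)


def missing_final_boundary(content_type, body):
    """True if a multipart body never contains its closing '--boundary--' line.

    Only the top-level boundary is checked; nested multiparts are ignored.
    """
    if not content_type.strip().lower().startswith("multipart/"):
        return False
    boundary = boundary_param(content_type)
    if boundary is None:
        # No boundary parameter at all: broken in a different way,
        # which is not what this particular test is about.
        return False
    closing = "--" + boundary + "--"
    # Trailing whitespace after the close delimiter is tolerated.
    return not any(line.rstrip() == closing for line in body.splitlines())


def has_rfc934_encapsulation(body):
    """True if the body has both a 'start' and an 'end' marker line."""
    lines = body.splitlines()
    has_start = any(RFC934_START.search(line) for line in lines)
    has_end = any(RFC934_END.search(line) for line in lines)
    return has_start and has_end
```

The first check corresponds to the MIME_MISSING_BOUNDARY idea and the second to the proposed negative-score test. Requiring both a start and an end marker is my own choice here to cut down on accidental matches; the egrep counts above looked at each marker separately.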