Re: Really hard-to-filter spam

David B Funk Thu, 27 Jul 2023 21:27:40 -0700

On Fri, 28 Jul 2023, Jared Hall wrote:

On 7/27/2023 12:08 PM, Ken D'Ambrosio wrote:
Hey, all. I've recently started getting spam that's really hard to deal with, and I'm open to suggestions as to how to approach it. Superficially,

[snip..]

The damn body's been encoded! And there's so little in there that it's not triggering on many rules (e.g., Bayesian doesn't go over 20%). If anyone has a bright idea -- maybe a way to decode the attachments and run a regex against _that_? -- I'm all ears.
1. There are milters/content-filters that decode Base64 message parts (amavisd-new, mimedefang, etc) for processing by SA.
2. There are still sufficiently unique items: First-Name-Only, Mixed-Case word in the Subject (NLP modeling), and a Base-64 encoded HTML attachment (w/ UTF-8 encoding no less). Combined in a Meta rule, these innocuous items will likely hit with good accuracy even without Base64 decoding.

Umm, unless I'm really missing something here the usual SA processing decodes such body stuff (QP, Base64, etc) and feeds the "cleaned" text to the rule processing engine.

You have to work hard to get matches done on the raw stuff if you want to do special rule matching on the un-decoded body.
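
To illustrate the difference between matching decoded and un-decoded body text, here is a rough Python sketch. It is not SA's code and does not use SA's rule engine; it only uses Python's standard email module, and the sample message, the pattern, and the helper name decoded_text_parts are all made up for the example. It shows that a plain-text pattern finds nothing in the Base64 blob but does match once the part has been decoded.

```python
import base64
import re
from email import message_from_string
from email.policy import default

# Hypothetical sample: a single HTML part sent with Base64 transfer encoding.
html = "<html><body><a href='http://example.com/'>Click here</a></body></html>"
encoded = base64.b64encode(html.encode("utf-8")).decode("ascii")
raw_message = (
    "From: sender@example.com\n"
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/html; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded + "\n"
)

# Hypothetical pattern standing in for a body rule's regex.
pattern = re.compile(r"click here", re.IGNORECASE)


def decoded_text_parts(message):
    """Yield each text/* part with its transfer encoding and charset undone."""
    for part in message.walk():
        if part.get_content_maintype() == "text":
            yield part.get_content()


msg = message_from_string(raw_message, policy=default)

# Matching against the message as it arrived on the wire: the Base64 blob hides the text.
raw_hit = bool(pattern.search(raw_message))

# Matching against the decoded parts: the same pattern now sees the HTML.
decoded_hit = any(pattern.search(text) for text in decoded_text_parts(msg))

print(raw_hit, decoded_hit)
```

The point is only that the regex gets a chance at the real text after decoding; how SA itself splits, decodes, and renders parts is more involved than this.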



--
Dave Funk                               University of Iowa
<dbfunk (at) engineering.uiowa.edu>     College of Engineering
319/335-5751   FAX: 319/384-0549        1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin         Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
