On Thu, 3 Jun 2021, KADAM, SIDDHESH wrote:

Hello Folks,

Is there any way we can scan the content of an attachment, i.e. 
.doc/.pdf/.xls/.ppt etc.?

The plan is to have DLP-style protection with the help of SpamAssassin.
Regards,
Siddhesh

SpamAssassin really isn't the best tool for this job. It is designed for looking at text, and how do you squeeze the text out of a .ppt or .xls in a meaningful way? Even more limiting, SpamAssassin is designed for small to medium-sized messages; scanning anything over 500KB or so is going to be a resource hog.

What would be better is a tool that is already designed for scanning .doc/.pdf/.xls/.ppt etc.: an anti-virus program with custom rules for the kinds of info you want to detect.

ClamAV has built-in DLP rules for standard kinds of PII (e.g. CC#s, SSNs, etc.) and comes with tools to help you craft custom rules if you have particular kinds of info you need DLP for.
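
To give a feel for what a credit-card style DLP rule has to do, here is a minimal Python sketch. It is not ClamAV's implementation, just an illustration of the usual idea: find digit runs of plausible length, then keep only those that pass the Luhn checksum so that random numbers do not trigger. The names CANDIDATE, luhn_ok and find_card_numbers are made up for this example, and it assumes the attachment text has already been extracted.

```python
import re

# Candidate runs of 13-19 digits, optionally separated by single spaces or dashes.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return digit strings in text that look like card numbers and pass Luhn."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            hits.append(digits)
    return hits
```

A real rule would also want a minimum hit count per message to keep false positives down.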

Start with a mail scanning framework (e.g. amavis or MIMEDefang) and plug in SpamAssassin for spam and two instances of ClamAV, one with standard anti-virus rulesets and another with your DLP rules. Then you can use the framework to take whatever actions you want based on which components 'fired'.
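
The "act on what fired" part could look roughly like the Python sketch below. This is not amavis or MIMEDefang code; the Verdicts class, the decide function, the action names and the 5.0 spam threshold are all assumptions for illustration. The real frameworks have their own hooks and configuration for this.

```python
from dataclasses import dataclass


@dataclass
class Verdicts:
    spam_score: float   # score reported by the spam scanner
    virus_hit: bool     # the anti-virus ClamAV instance matched
    dlp_hit: bool       # the DLP ClamAV instance matched


def decide(v: Verdicts, spam_threshold: float = 5.0) -> str:
    """Pick one action per message; the most serious finding wins."""
    if v.virus_hit:
        return "reject"
    if v.dlp_hit:
        return "quarantine"   # hold for review rather than bounce
    if v.spam_score >= spam_threshold:
        return "tag"
    return "deliver"
```

Keeping the DLP verdict separate from the anti-virus one is the point of running two ClamAV instances: a DLP hit usually wants a quieter action than a virus hit.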




--
Dave Funk                               University of Iowa
<dbfunk (at) engineering.uiowa.edu>     College of Engineering
319/335-5751   FAX: 319/384-0549        1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin         Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
