[ 
https://issues.apache.org/jira/browse/COMDEV-259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin A. McGrail updated COMDEV-259:
------------------------------------
    Description: 
SpamAssassin is computationally expensive to run, and the cost grows as emails 
get larger.

SA therefore has safety valves that limit the maximum size of emails it 
processes, to prevent excessive resource usage and DoS.

Spammers sometimes try to get around this by sending short messages with 
large attachments, so that the email exceeds the size limit and bypasses 
scanning.

Existing techniques for truncating emails are sometimes less effective, so 
this task would be to identify better techniques, applied in order, such as:

 * Remove attachments for scanning
 * Check size and scan if small enough
 * Truncate the message after closing any open tags
 * Scan
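
The steps above can be sketched roughly as follows. This is only an 
illustration in Python using the standard library, not SpamAssassin code 
(SpamAssassin itself is written in Perl). The names MAX_SCAN_CHARS, 
scannable_text, truncate_closing_tags and prepare_for_scan are hypothetical, 
the limit is an arbitrary value measured in characters rather than bytes, 
headers are left out for brevity, and "closing any open tags" is interpreted 
here as dropping a tag cut in half and appending end tags for HTML elements 
that are still open.

```python
from email import message_from_bytes
from email.message import Message
from html.parser import HTMLParser

MAX_SCAN_CHARS = 500_000  # hypothetical limit, not a SpamAssassin default

# HTML elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class OpenTagTracker(HTMLParser):
    """Records which HTML elements are still open after feeding some text."""

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Pop back to (and including) the most recent matching open tag.
            while self.open_tags.pop() != tag:
                pass


def scannable_text(msg: Message) -> str:
    """Step 1: keep only text parts that are not attachments."""
    chunks = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        if part.get_content_disposition() == "attachment":
            continue
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset label
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def truncate_closing_tags(text: str, limit: int) -> str:
    """Step 3: cut at the limit, then close whatever tags are still open."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    last_open = cut.rfind("<")
    if last_open > cut.rfind(">"):
        cut = cut[:last_open]  # the cut fell inside a tag; drop the fragment
    tracker = OpenTagTracker()
    tracker.feed(cut)
    tracker.close()
    return cut + "".join(f"</{tag}>" for tag in reversed(tracker.open_tags))


def prepare_for_scan(raw: bytes, limit: int = MAX_SCAN_CHARS) -> str:
    """Steps 1-3; the caller hands the result to the scanner (step 4)."""
    text = scannable_text(message_from_bytes(raw))
    if len(text) <= limit:  # step 2: small enough, scan as-is
        return text
    return truncate_closing_tags(text, limit)
```

Note that the appended end tags can push the result slightly past the limit, 
so a real implementation would need to reserve room for them.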

 

Apache SpamAssassin is a mail filter to identify spam. It is an intelligent 
email filter which uses a diverse range of tests to identify unsolicited bulk 
email, more commonly known as spam. These tests are applied to email headers 
and content to classify email using advanced statistical methods. 

In addition, SpamAssassin has a modular architecture that allows other 
technologies to be quickly wielded against spam and is designed for easy 
integration into virtually any email system. 

It is primarily written in Perl with a few bits in C and shell scripts for 
system integration.

The compendium at 
https://raptor.pccc.com/raptor.cgim?template=email_spam_compendium is helpful 
for understanding some of the concepts behind SpamAssassin.

It will be helpful for a student on this project to understand SMTP, but a 
willingness to learn and to set up your own mail server on a Linux 
distribution with SpamAssassin for a personal test domain is highly 
desirable. Assistance will be provided to get the basic framework for a 
learning sandbox in place.

As email becomes more commoditized by major providers, knowledge of email 
systems and their security is dwindling.  This opportunity can provide 
real-world experience with an email security product that is employed by 
countless commercial systems in the world.

  was:
SpamAssassin is a computationally expensive process to run.  It is more costly 
as emails get larger.

So SA has safety valves to prevent excessive usage and DoS by limiting the 
maximum size of emails processed.

Spammers sometimes try and get around this with short messages and large 
attachments to bypass scanning.

There are techniques that exist to truncate emails that are sometimes less 
effective so this task would be to identify better techniques in order such as:

For example:
 * remove attachments for scanning
 * Check size and scan if small enough
 * Truncate the message after closing any open tags
 * scan


> GSOC 2018 SpamAssassin Size Check Improvements
> ----------------------------------------------
>
>                 Key: COMDEV-259
>                 URL: https://issues.apache.org/jira/browse/COMDEV-259
>             Project: Community Development
>          Issue Type: Project
>          Components: GSoC/Mentoring ideas
>            Reporter: Kevin A. McGrail
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
For additional commands, e-mail: dev-h...@community.apache.org