Okay, I think I was mistaken about the lack of base64 decoding being 
the problem.

Does SpamAssassin recurse over subparts in multipart messages?  It 
appears that for multipart messages of the form:

Headers (Content-Type: multipart/alternative;)
 |
 ---Part 1 (plain text)
 (blank)
 |
 ---Part 2 (base64 encoded HTML)
 (encoded content)

Part 2 is not being processed as part of the message body.

Am I missing something?
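
To make the question concrete, here is a rough sketch of what I mean by 
recursing over the subparts. It is not SpamAssassin code (SA is written 
in Perl); it uses Python's standard email package only to illustrate the 
idea. The sample message, the boundary string "XYZ", the addresses, and 
the function name decoded_text_parts are all made up for the example.

```python
import base64
from email import message_from_string

# Hypothetical sample message mirroring the layout described above:
# an empty plain-text part followed by a base64-encoded HTML part.
html = "<html><body>Click here to buy now!</body></html>"
encoded = base64.b64encode(html.encode("ascii")).decode("ascii")

RAW = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    f"{encoded}\n"
    "--XYZ--\n"
)


def decoded_text_parts(raw):
    """Return (content_type, decoded_text) for every text/* leaf part."""
    msg = message_from_string(raw)
    results = []
    # walk() visits the top-level message and every nested subpart.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body text of their own
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes the Content-Transfer-Encoding (base64 here).
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "ascii"
        results.append(
            (part.get_content_type(), payload.decode(charset, errors="replace"))
        )
    return results


for content_type, text in decoded_text_parts(RAW):
    print(content_type, repr(text))
```

If the body tests were run against the text collected this way, the 
HTML in Part 2 would be examined even though Part 1 is blank.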

On 28 Jan 2002 at 16:09, Nels Lindquist wrote:

> I'm new to the SpamAssassin world, so I hope this issue hasn't been 
> discussed to death already.
> 
> SpamAssassin first came to my attention on the MIMEDefang mailing 
> list when SA integration was added to 'Fang.  I've been playing with 
> it for a little while now, and I've been very pleased with the 
> results.
> 
> However, I noticed that a few messages which were clearly spam were 
> sneaking through just under the threshold.  It wasn't immediately 
> obvious why this should be since the messages had all the hallmarks 
> of blatant spam which typically gets quite a high score.
> 
> I determined that the messages in question had their bodies base64 
> encoded.  It would appear that SpamAssassin isn't decoding the base64 
> strings prior to doing its analysis.
> 
> Since I'd been in the habit of manually decoding such messages so 
> that I could report them via SpamCop, I did some testing.
> 
> I took one of the messages which had scored 4.2 (threshold is the 
> default 5.0), manually decoded the base64 string, and built a message 
> using the same headers and the decoded message body.
> 
> Using the command-line version of SA 2.01, I checked the decoded 
> message and the score jumped to 19.1.  Checking with SA 1.5 on a 
> different machine, the score was 25.
> 
> I had similar results with other messages.  Some of them weren't as 
> dramatic, but nearly all of the scores were bumped over the spam 
> threshold due to hits from the decoded message body.
> 
> Has anyone considered adding base64 decoding functionality to 
> SpamAssassin in order to improve its accuracy on such messages?
----
Nels Lindquist <*>
Information Systems Manager
Morningstar Air Express Inc.


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
