[SAtalk] Re: dictionary words in ascii part of mime

Bob Proulx Fri, 09 Jan 2004 01:56:20 -0800

Alex Stade wrote:
> I run SpamAssassin 2.61 and it catches a lot of spam, but lately, there is 
> spam getting through that has bare dictionary words in the ASCII part of a 
> MIME message and all the usual junk in the multimedia part. When reading 
> these e-mails in Outlook or something like that, the client renders the 
> messages beautifully and displays all the HTML and executes all the
> arbitrary code that comes with it.


This is called Bayes poison, and it is becoming very common in
spam.

By default Outlook prefers the HTML part to the plain text part.  By
default text mailers (such as my favorite, mutt) prefer the plain
text.  So I only see the random garbage and not the HTML, although
some spammers are starting to get literate and include excerpts from
novels!  :-)
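
To make the two views concrete, here is a minimal sketch using Python's
standard email package.  The message and its words are made up for
illustration, and the preference order passed to get_body() only
approximates how a real client chooses which part to display.

```python
from email.message import EmailMessage

# Hypothetical message, built only for illustration: dictionary words in
# the text/plain part, the actual pitch in the text/html part.
msg = EmailMessage()
msg["Subject"] = "hello"
msg.set_content("orchard lantern velvet harbor meadow\n")
msg.add_alternative(
    "<html><body><b>Buy now!</b></body></html>\n", subtype="html"
)


def body_as_seen(message, prefer):
    """Return the text of the part a client with this preference shows."""
    part = message.get_body(preferencelist=prefer)
    return part.get_content()


# A text mailer prefers plain text; an HTML-preferring client does not.
plain_view = body_as_seen(msg, ("plain", "html"))
html_view = body_as_seen(msg, ("html", "plain"))

print("text mailer sees:", plain_view.strip())
print("HTML client sees:", html_view.strip())
```

The same message yields only the dictionary words under one preference
and only the HTML pitch under the other.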

> The amount of text is varying, but it appears difficult to train a bayes 
> database to distinguish these as bad words, yet retain them as good words. 

That is exactly the purpose of the Bayes poison.  It is intended to
get in the way of Bayesian analysis.  Be assured that this is a hot
topic of discussion and that the developers are well aware of the
problem and working on countermeasures.
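
As a rough illustration of why the padding can work, here is a toy
sketch of naive Bayes style combining.  It is not SpamAssassin's actual
combining method, and every per-token probability below is an invented
number; the point is only that enough mildly "hammy" words can drag
down a score that the pitch words alone would push close to 1.

```python
from math import prod


def combined_spam_probability(token_probs):
    """Combine per-token spam probabilities, treating tokens as independent.

    This is a simplified textbook formula, assumed here for illustration.
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# Invented per-token spam probabilities for the actual sales pitch.
pitch = [0.95, 0.9, 0.9]
# Invented probabilities for ten ordinary dictionary words that this
# user's own mail has taught the filter to treat as slightly hammy.
poison = [0.3] * 10

clean_score = combined_spam_probability(pitch)
poisoned_score = combined_spam_probability(pitch + poison)

print(f"pitch only:        {clean_score:.3f}")
print(f"pitch plus poison: {poisoned_score:.3f}")
```

Words the filter has no opinion about (probability 0.5) leave this
toy score unchanged, so in this model the poison only helps the
spammer when the words look like the recipient's legitimate mail.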

> So the question finally, is, how do I protect against this type of spam?

For me personally SA is still tagging the spam at a very good rate.  I
am only seeing these types of spam in my caughtspam folder.  But I am
also very aggressive about rejecting as much spam as possible at the
MTA level.  And I am really only seeing them because I am poking at
the remains and examining them.  Are the non-Bayes rules really doing
poorly against these messages for you?  They may be, since spammers
are prescreening against SA.  That is what the Bayesian inference
engine is designed to counter: it creates a custom rule for you and no
one else, one that the spammers would not be able to avoid.  But if
the Bayes poison is working then we have to switch to a Plan B.

Bob
