> It had nothing in the body. Without seeing that relay before, both
> BAYES_80 and UNIQUE_WORDS caught it.
>
> Excluding the attachment encoding itself, here's what it had:
>
> Received: from [83.76.165.174] (HELO lmnht)
>   by mail.rudd.cc (CommuniGate Pro SMTP 5.1.4 _community_)
>   with SMTP id 1081873 for [EMAIL PROTECTED]; Wed, 27 Jun 2007 05:11:47 -0700
> Received-SPF: none
>   receiver=mail.rudd.cc; client-ip=83.76.165.174;
> [EMAIL PROTECTED]
> Received: from [33.31.118.54] (helo=iyaty)
>   by lmnht with smtp (Exim 4.66 (FreeBSD))
>   id 1I4j0S-0003Q8-5s; Wed, 27 Jun 2007 14:12:06 +0200
> Message-ID: <[EMAIL PROTECTED]>
> Date: Wed, 27 Jun 2007 14:11:19 +0200
> From: Annabel Cleveland <[EMAIL PROTECTED]>
> User-Agent: Thunderbird 1.5.0.12 (Windows/20070509)
> MIME-Version: 1.0
> To: [EMAIL PROTECTED]
> Subject: Re: Cheque.22.pdf
> Content-Type: multipart/mixed;
>   boundary="------------040808030703010202050005"
>
> --------------040808030703010202050005
> Content-Type: text/plain; charset=windows-1252; format=flowed
> Content-Transfer-Encoding: 7bit
>
>
>
> --------------040808030703010202050005
> Content-Type: application/pdf;
>   name="Cheque.22.pdf"
> Content-Transfer-Encoding: base64
> Content-Disposition: inline;
>   filename="Cheque.22.pdf"
>
> [attachment data omitted]
> --------------040808030703010202050005--
>
I'm going to guess that UNIQUE_WORDS hit on the MIME definitions (the ratio is actually pretty bad, as there are only about 4 non-unique words by my count). As for BAYES, the score it assigns depends entirely on how much ham with attachments (for all the common MIME stuff) and ham with PDF attachments (for the content-type) has been trained, versus spam of the same kind. In our case that could be a lot; for yours, I obviously don't know. If I were to set up per-user BAYES for my accounts, I'm sure BAYES would work perfectly at catching this stuff, since I only receive ham with PDF attachments once or twice a year, and it would be working mostly off the content-type.
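
To make the training-ratio point concrete, here is a rough sketch of how a single token such as the application/pdf content-type could end up with very different spam probabilities in a site-wide database versus a per-user one. This is a simplified per-token estimate with equal priors, not SpamAssassin's actual Bayes implementation, which is more involved. The function name and all of the counts are made up purely for illustration.

```python
def token_spam_probability(spam_hits, ham_hits, n_spam, n_ham):
    """Simplified estimate of P(spam | token) from training counts.

    Illustrative only: compares how often the token shows up in trained
    spam versus trained ham, assuming equal priors for the two classes.
    """
    spam_freq = spam_hits / n_spam if n_spam else 0.0
    ham_freq = ham_hits / n_ham if n_ham else 0.0
    if spam_freq + ham_freq == 0:
        return 0.5  # token never seen in training: stay neutral
    return spam_freq / (spam_freq + ham_freq)


# Hypothetical site-wide database: plenty of legitimate PDF attachments,
# so the content-type token carries little information either way.
site_wide = token_spam_probability(spam_hits=300, ham_hits=250,
                                   n_spam=1000, n_ham=1000)

# Hypothetical per-user database: ham with a PDF attachment shows up
# only a couple of times, so the same token looks strongly spammy.
per_user = token_spam_probability(spam_hits=300, ham_hits=2,
                                  n_spam=1000, n_ham=1000)

# A token that was never trained at all stays neutral.
unseen = token_spam_probability(spam_hits=0, ham_hits=0,
                                n_spam=1000, n_ham=1000)

print(f"site-wide: {site_wide:.3f}")
print(f"per-user:  {per_user:.3f}")
print(f"unseen:    {unseen:.3f}")
```

With these invented numbers the site-wide estimate sits only slightly above neutral, while the per-user one is close to 1, which is the effect described above: the fewer ham PDFs a database has seen, the harder the content-type alone pushes the message towards spam.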