Jenny Lee-2 wrote:
>
>
>> Date: Tue, 13 Mar 2012 05:47:03 -0700
>> From: [email protected]
>> To: [email protected]
>> Subject: RE: Help with blocking Chinese Spam
>>
>>
>>
>> Jenny Lee-2 wrote:
>> >
>> > I did turn it on in the .pre. It is also supposed to add a header, but it
>> > does not. How can I check if it is working or not?
>> >
>> > I have:
>> >
>> > ok_locales en
>> > ok_languages en
>> >
>> > Jenny
>> >
>>
>>
>> Add this to your config file:
>>
>> add_header all Language _LANGUAGES_
>
> This adds the header. Thank you.
>
> However, running: spamassassin -D < chinesespam
>
> Does not catch this.
>
> Jenny
>
> Mar 13 17:06:36.294 [27011] dbg: plugin:
> Mail::SpamAssassin::Plugin::TextCat=HASH(0x1d50bc8) implements
> 'extract_metadata', priority 0
> Mar 13 17:06:36.294 [27011] dbg: message: ---- MIME PARSER START ----
> Mar 13 17:06:36.295 [27011] dbg: message: parsing multipart, got boundary:
> ----=_NextPart_000_004F_0181A2CA.182A5CF0
> Mar 13 17:06:36.295 [27011] dbg: message: found part of type
> multipart/alternative, boundary: ----=_NextPart_001_034A_0181A2CA.182A5CF0
> Mar 13 17:06:36.296 [27011] dbg: message: added part, type:
> multipart/alternative
> Mar 13 17:06:36.299 [27011] dbg: message: found part of type
> application/vndms-excel, boundary:
> ----=_NextPart_000_004F_0181A2CA.182A5CF0
> Mar 13 17:06:36.299 [27011] dbg: message: added part, type:
> application/vndms-excel
> Mar 13 17:06:36.299 [27011] dbg: message: parsing multipart, got boundary:
> ----=_NextPart_001_034A_0181A2CA.182A5CF0
> Mar 13 17:06:36.300 [27011] dbg: message: found part of type text/plain,
> boundary: ----=_NextPart_001_034A_0181A2CA.182A5CF0
> Mar 13 17:06:36.300 [27011] dbg: message: added part, type: text/plain
> Mar 13 17:06:36.301 [27011] dbg: message: found part of type text/html,
> boundary: ----=_NextPart_001_034A_0181A2CA.182A5CF0
> Mar 13 17:06:36.301 [27011] dbg: message: added part, type: text/html
> Mar 13 17:06:36.301 [27011] dbg: message: parsing normal part
> Mar 13 17:06:36.302 [27011] dbg: message: parsing normal part
> Mar 13 17:06:36.302 [27011] dbg: message: parsing normal part
> Mar 13 17:06:36.302 [27011] dbg: message: ---- MIME PARSER END ----
> Mar 13 17:06:36.303 [27011] dbg: message: decoding base64
> Mar 13 17:06:36.303 [27011] dbg: message: decoding base64
> Mar 13 17:06:36.310 [27011] dbg: textcat: classifying, skipping: yi sco lv
> is bs sl la ga sa eu et rm cy eo fy gd lt
> Mar 13 17:06:36.328 [27011] dbg: textcat: can't determine language
> uniquely enough
> Mar 13 17:06:36.328 [27011] dbg: textcat: X-Languages: "",
> X-Languages-Length: 671
>
It looks like textcat does not work reliably when the message is encoded. For
the mail you posted on pastebin, textcat guessed "ja.shift-jis", which then
triggered UNWANTED_LANGUAGE_BODY.
However, for other Chinese spam that got through in the last few days, it
either could not guess the language at all or even guessed "en".
Is this a general problem, i.e. is SpamAssassin not really able to decode
this kind of mail?
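
For reference, here is the setup discussed in this thread, collected in one
place. It is only a sketch: the file names and the score value are my
assumptions and are not taken from your system.

# in one of the .pre files (assumed here: v310.pre), load the plugin
loadplugin Mail::SpamAssassin::Plugin::TextCat

# in your local config file (assumed here: local.cf)
ok_locales en
ok_languages en
# show the guessed language(s) in a header on every scanned message
add_header all Language _LANGUAGES_
# example score only (an assumption), adjust to taste
score UNWANTED_LANGUAGE_BODY 3.0

With that in place, running "spamassassin -D textcat < chinesespam" should
limit the debug output to the textcat lines, which makes it easier to see
what was guessed for a given message.
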
Daniel
--
View this message in context:
http://old.nabble.com/Help-with-blocking-Chinese-Spam-tp33493147p33494200.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.