Well, it is not easy to quote properly from Hotmail. Excuse the mess and the top posting. Bottom line: I got rid of this Chinese spam. Thank you all for the help, SA users.

Jenny
---------
> Subject: Re: Help with blocking Chinese Spam
>
> On Tue, 13 Mar 2012 12:40:16 +0000
> Jenny Lee <bodycar...@live.com> wrote:
>
> > Will give this a go. What I don't understand is that... Why is this
> > not catching this 'utf' which is on the subject?
>
> You need the :raw tag to see the raw, unencoded header. The meta-rule:
>
> header __RP_SUBJ_CJK Subject =~ /[\xe4-\xe9]/
>
> attempts to limit matches on UTF-8 subjects to Chinese characters
> because the leading bytes e4-e9 in UTF-8 (mostly) cover CJK
> ideographs. It's not a perfect filter, but blocking all UTF-8-encoded
> subjects would yield way too many FPs for us.
>
> Regards,
>
> David.
>
> PS: I haven't looked at SA's Bayes implementation. Can it handle
> words in non-western character sets properly?

Thank you David, Jared and Jari.

Adding:

    Subject:raw =~ /=\?utf-8\?B/i
    Subject =~ /[\xe4-\xe9]/

caused this spam to get caught. Both work, so I will follow David's advice.

So I think I will just remove the TextCat plugin, which does not identify it properly.

This is a great list, thanks again to everyone. All help appreciated.

Jenny
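
For anyone who wants to see why the two patterns behave differently, here is a rough Python sketch (not SpamAssassin code, just an illustration). It assumes the raw Subject holds RFC 2047 encoded words such as =?utf-8?B?...?=, decodes them with the standard library, and then looks for UTF-8 lead bytes in the e4-e9 range, which start the three-byte sequences for U+4000 through U+9FFF. The names RAW_UTF8_B64, CJK_LEAD and looks_cjk are made up for this example.

```python
import re
from email.header import decode_header

# Roughly what the Subject:raw pattern sees: the still-encoded header text.
RAW_UTF8_B64 = re.compile(r"=\?utf-8\?B\?", re.IGNORECASE)

# Roughly what the plain Subject pattern sees: decoded bytes.
# Lead bytes e4-e9 begin the UTF-8 encodings of U+4000..U+9FFF.
CJK_LEAD = re.compile(rb"[\xe4-\xe9]")


def looks_cjk(raw_subject: str) -> bool:
    """Return True if a UTF-8 encoded word in the subject decodes to
    bytes containing a lead byte in the e4-e9 range."""
    for chunk, charset in decode_header(raw_subject):
        # Unencoded text comes back as str (or as bytes with charset None);
        # only inspect chunks that were declared as UTF-8.
        if isinstance(chunk, bytes) and (charset or "").lower() == "utf-8":
            if CJK_LEAD.search(chunk):
                return True
    return False


def is_utf8_base64_subject(raw_subject: str) -> bool:
    """Return True if the raw header contains a base64 UTF-8 encoded word."""
    return bool(RAW_UTF8_B64.search(raw_subject))
```

The first check fires on any base64 UTF-8 subject, including legitimate accented Western text, which is the FP concern David mentions; the second only fires when the decoded bytes fall in the mostly-CJK range.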