Re: spam in foreign characters

2012-08-22 Thread Axb
On 08/21/2012 09:30 PM, Adam Moffett wrote: > I have a user who seems to get 4-5 messages per day with Chinese characters for the subject and body. They come from a variety of domains and IPs, so I guess she somehow got onto a list used to spam Chinese-speaking people. If I paste them into Google

RE: spam in foreign characters

2012-08-22 Thread Daniel Lemke
> -----Original Message-----
> From: Niamh Holding [mailto:ni...@fullbore.co.uk]
> Sent: Wednesday, August 22, 2012 8:01 AM
> To: users@spamassassin.apache.org
> Subject: Re: spam in foreign characters
>
> dcc> match all Chinese email if that's what you

Re: spam in foreign characters

2012-08-21 Thread Niamh Holding
Hello Darxus, Tuesday, August 21, 2012, 8:42:33 PM, you wrote: dcc> match all Chinese email if that's what you want mimeheader NH_CHINESE Content-Type =~ /charset="?gb2312/i score NH_CHINESE 2.5 describe NH_CHINESE Chinese character s
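
For readability, the rule quoted in the snippet above can be laid out one directive per line, as it might appear in local.cf. This is only a sketch: the ifplugin guard and the describe text are assumptions added here (the original description is cut off in the archive), and it presumes the MIMEHeader plugin is loaded in your installation.

```
# Sketch of a local.cf fragment; assumes the MIMEHeader plugin is available.
ifplugin Mail::SpamAssassin::Plugin::MIMEHeader
  # Match any MIME part whose Content-Type declares the GB2312 charset.
  mimeheader NH_CHINESE Content-Type =~ /charset="?gb2312/i
  # Placeholder description, not the original (truncated) wording.
  describe   NH_CHINESE MIME part declares a GB2312 charset
  score      NH_CHINESE 2.5
endif
```

After editing, running `spamassassin --lint` should report any syntax problems. Note that the rule fires on the declared charset only, so it would also hit legitimate mail sent as GB2312 and would miss Chinese text sent under other charsets.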

Re: spam in foreign characters

2012-08-21 Thread John Hardin
On Tue, 21 Aug 2012, Adam Moffett wrote: One of our users definitely emails with Chinese vendors. I'm sure they correspond in English, but I'm guessing the Chinese folks might have Chinese characters in their signature line or some such. Consider Bayes. I have trained my Bayes with Chinese-

Re: spam in foreign characters

2012-08-21 Thread Adam Moffett
I think I'd have to read Chinese to tackle that accurately. So, you should probably try using ok_locales, and if it doesn't work, create your own rules to match these spams, if you can find good common patterns that don't seem likely to match non-spams (or match all Chinese email if that's what

Re: spam in foreign characters

2012-08-21 Thread Adam Moffett
Awesome, thanks for the tip. Any guess how this affects messages with mixed character sets? One of our users definitely emails with Chinese vendors. I'm sure they correspond in English, but I'm guessing the Chinese folks might have Chinese characters in their signature line or some such. T

Re: spam in foreign characters

2012-08-21 Thread darxus
SpamAssassin has an ok_locales thing that allows you to specify basically languages you want to accept. But it has problems: https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4078 I don't believe anybody has created rules to match these kinds of spams. A big part of the problem is lacking ex
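
As a rough illustration of the ok_locales setting mentioned above, a local.cf entry might look like the following. The choice of `en` is an assumption made for this example, and the problems described in the linked bug report still apply.

```
# Sketch only: treat English as the sole expected locale.
# Mail in character sets belonging to other locales may then
# trigger SpamAssassin's charset-based rules.
ok_locales en
```

A site that also receives legitimate Chinese mail could list both locales instead (`ok_locales en zh`), at the cost of no longer flagging the Chinese-language spam this way.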

spam in foreign characters

2012-08-21 Thread Adam Moffett
I have a user who seems to get 4-5 messages per day with Chinese characters for the subject and body. They come from a variety of domains and IPs, so I guess she somehow got onto a list used to spam Chinese-speaking people. If I paste them into Google Translate they seem to be roughly the sam