Arvid Ephraim Picciani wrote:
greetings.
any ideas for spam in Russian and Chinese? (some even with broken charset)
XBL and bayes are very effective but not enough :/
I'd like to have some kind of language matcher. We don't have people speaking
Russian in the company, so it would be nice to give 1 or 2 points on just the
language.
Well, SpamAssassin has two tools to help here.
ok_locales will check character sets. By default it allows everything,
but you can change it to only allow character sets that are appropriate
for your locale.
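
As a rough sketch, assuming your site only expects mail in Western
character sets, the setting might look like this in local.cf. The "en"
value is just an example; check the docs for the list of locale codes
before picking one.

```
# local.cf
# Assumption: only Western (English-style) character sets are expected here.
# The default is "ok_locales all", which allows every character set.
ok_locales en
```
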
Also, there's the TextCat plugin, which you'd have to un-comment in
v310.pre. Once that's enabled, you can start using ok_languages, which
tries to guess at the language of a message based on character combinations.
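
A minimal sketch of the two pieces, assuming English is the only language
you expect to receive. The score is purely illustrative (picked to match
the "1 or 2 points" you mentioned), so tune it against your own mail.

```
# v310.pre: remove the leading '#' from this line to load the plugin
loadplugin Mail::SpamAssassin::Plugin::TextCat

# local.cf
# Assumption: English is the only wanted language. Default is "ok_languages all".
ok_languages en

# UNWANTED_LANGUAGE_BODY is the rule TextCat fires for languages not listed above.
# 2.0 is an example value, not a recommendation.
score UNWANTED_LANGUAGE_BODY 2.0
```
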
Please read the docs closely, as there are a lot more languages than
locales, so a code that's valid for ok_languages isn't necessarily valid
for ok_locales, and vice versa. (There are lots of languages that all use
the same character sets.)
http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html#language_options
http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_TextCat.html