Stand H wrote:
>
> I'm not sure if I can feed non-English email to
> sa-learn. I have default ok_languages and ok_locales
> and Mail::SpamAssassin::Plugin::TextCat also loaded.
> Can I train it?
>
> I have read the man page of TextCat but still don't
> understand the concept of language support in
> SpamAssassin.
>
> Here is the quote from the TextCat man page:
> "You can then specify which languages are considered
> okay for incoming mail and if the guessed language is
> not okay, UNWANTED_LANGUAGE_BODY is triggered"
>
> If languages are considered ok what will happen? Any
> effect on other (rule) checking?
>
> What happens if UNWANTED_LANGUAGE_BODY is triggered?
>
> Can anyone please explain this to me in detail?

The Bayes database that is updated via sa-learn is completely
independent of your other SpamAssassin settings, including
ok_languages and ok_locales. Let it learn as much ham and spam as
you can manage and don't worry about languages.

TextCat can help if you only want messages in certain languages.
It attempts to determine which language the message was written
in. You use the ok_languages setting to specify the languages you
wish to allow. If the guessed language is in that list, nothing
happens. If it is not, UNWANTED_LANGUAGE_BODY is triggered and the
score for that rule is added to the message. I haven't
investigated the detailed use of TextCat, but the above is a
simple description of what it does.

ok_locales is slightly different. Instead of trying to guess the
language, it simply looks at the character set. You list the
locales that you wish to accept, and messages using character sets
that belong to other locales will have points added.

-- 
Bowie
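
P.S. For illustration, the settings described above might look
roughly like this in local.cf. This is only a sketch: the language
and locale codes are example choices, the score value is arbitrary,
and you should check the TextCat and Mail::SpamAssassin::Conf man
pages for the codes your version accepts.

```
# TextCat must be loaded (normally done in one of the .pre files).
loadplugin Mail::SpamAssassin::Plugin::TextCat

# Languages you are willing to receive (example: English and German).
# Mail guessed to be in any other language hits UNWANTED_LANGUAGE_BODY.
ok_languages en de

# Locales whose character sets you are willing to receive (example).
ok_locales en

# Optional: how many points UNWANTED_LANGUAGE_BODY adds.
# The value here is an arbitrary example, not a recommendation.
score UNWANTED_LANGUAGE_BODY 2.0
```

None of these lines affect what sa-learn stores in the Bayes
database.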