On Wed, 2013-11-27 at 13:38 -0500, Mauricio Tavares wrote:
> Let's say I have
>
> ok_languages en
>
> and I get an email from Canada that is mostly in English but for the
> little disclaimer on the bottom. How can I tell textcat to only flag
> an email if more than some percentage of the body text is not in a
> ok_languages?

I haven't actually used the TextCat plugin, but according to the
documentation [1]

  "The rule UNWANTED_LANGUAGE_BODY is triggered if none of the languages
   detected are in the "ok" list."

English is NOT one of the languages recognized. Given it fired the
unwanted language rule, at least one language has been recognized with
an acceptable score above the threshold.

Your problem is not TextCat recognizing the other language (probably
French), but TextCat failing to recognize English in that message.

[1] http://spamassassin.apache.org/doc/Mail_SpamAssassin_Plugin_TextCat.html

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}