On 09/09/2011 02:16 AM, Alok Kushwaha wrote:
>> I am using the 'SpamAssassin Server version 3.3.2' but 'Spanish
>> spams' are getting through. Can anyone please suggest/point me the
>> rule-set/plug-in for Spanish spams.

The short answer is to train Bayes; it's far better at this sort of thing than anything else, even the language detection I'm about to suggest.

Enable (un-comment) TextCat in v310.pre and then add this to your local.cf (adjust as needed):

  ok_languages en hi

If that's not enough, create an anti-Spanish rule:

  header SPANISH_BODY X-Languages =~ /\bes/

(You'll have to verify that header name; I thought we always named our headers and pseudo-headers X-Spam-*. Also note that this is a pseudo-header, which means it doesn't show up in your emails unless you tell it to, e.g. with a line like "add_header all Languages _LANGUAGES_", though then it will always be named "X-Spam-Languages".)

See also the perldoc/man page for Mail::SpamAssassin::Plugin::TextCat

Note that Spanish is not the easiest language to detect, given its similarities to English and the fact that most conversations are spattered with English words and even phrases. This can only do so much.

Axb's solution is dangerous but might work for you:

> you mean block ñ á é ó í and what else? the rest is quivalent to en

So maybe something like:

  body __HAS_N_TILDE /[\xf1\xd1][a-z]/
  body __HAS_A_ACUTE /[\xc1\xe1]/
  body __HAS_E_ACUTE /[\xc9\xe9]/
  body __HAS_I_ACUTE /[\xcd\xed]/
  body __HAS_O_ACUTE /[\xd3\xf3]/
  body __HAS_U_ACUTE /[\xda\xfa]/
  body __HAS_LOS_LAS /\bl[ao]s\b/i
  body __HAS_DEL_DE_LA /\bde(?:l|\sla)\b/i
  body __HAS_ESTA_ESTE /\best[ae]\b/i
  body __HAS_PARA /\bpara\s/i
  meta MAYBE_SPANISH __HAS_N_TILDE + __HAS_A_ACUTE + __HAS_E_ACUTE + __HAS_I_ACUTE + __HAS_O_ACUTE + __HAS_U_ACUTE + __HAS_LOS_LAS + __HAS_DEL_DE_LA + __HAS_ESTA_ESTE + __HAS_PARA > 2

or maybe combining everything together; to all of the above, add:

  score MAYBE_SPANISH 0.0001
  # Zero or multiple languages detected
  header __LANG_UNKNOWN X-Languages =~ /^\s*$|\w \w/
  meta MAYBE_SPANISH2 SPANISH_BODY || (__LANG_UNKNOWN && MAYBE_SPANISH)
  score MAYBE_SPANISH2 1

When it comes to scoring, *always start small*.
You can turn it up (slowly, in small increments!) once you know it's safe for you.
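
One way to see how the MAYBE_SPANISH sub-rules behave before giving them any real score is to try the same patterns on a few sample strings. The Python sketch below is only an approximation and not part of SpamAssassin; the names SUBRULES, subrule_hits and maybe_spanish are made up for illustration. It assumes already-decoded Unicode text, whereas the rules above use Latin-1 byte escapes, so mail in other encodings (UTF-8, for instance) may hit differently in SpamAssassin itself.

```python
import re

# Decoded-text approximations of the __HAS_* sub-rules from this mail.
# Assumption: input is a Unicode string, not the raw bytes SpamAssassin sees.
SUBRULES = {
    "__HAS_N_TILDE": re.compile(r"[ñÑ][a-z]"),
    "__HAS_A_ACUTE": re.compile(r"[Áá]"),
    "__HAS_E_ACUTE": re.compile(r"[Éé]"),
    "__HAS_I_ACUTE": re.compile(r"[Íí]"),
    "__HAS_O_ACUTE": re.compile(r"[Óó]"),
    "__HAS_U_ACUTE": re.compile(r"[Úú]"),
    "__HAS_LOS_LAS": re.compile(r"\bl[ao]s\b", re.IGNORECASE),
    "__HAS_DEL_DE_LA": re.compile(r"\bde(?:l|\sla)\b", re.IGNORECASE),
    "__HAS_ESTA_ESTE": re.compile(r"\best[ae]\b", re.IGNORECASE),
    "__HAS_PARA": re.compile(r"\bpara\s", re.IGNORECASE),
}


def subrule_hits(text):
    """Return the names of the sub-rules that match somewhere in text."""
    return [name for name, rx in SUBRULES.items() if rx.search(text)]


def maybe_spanish(text):
    """Mirror the meta rule: more than two sub-rules must hit."""
    return len(subrule_hits(text)) > 2


if __name__ == "__main__":
    samples = [
        "Compre ahora las mejores ofertas del año para usted",
        "Please review the attached report before Friday",
        "Meet me at the café in Los Angeles",
    ]
    for sample in samples:
        print(maybe_spanish(sample), subrule_hits(sample), sample)
```

The third sample shows why the threshold matters: ordinary English text can trip a couple of these sub-rules on its own, which is one more reason to keep the score tiny until you have checked it against your own ham.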