On 18/05/12 07:54, dar...@chaosreigns.com wrote:
> Locale handling is a known problem in SA:
> https://issues.apache.org/SpamAssassin/show_bug.cgi?id=3062

bug opened in 2004 :-(

I'm no linguist, but this is probably an extremely hard problem to solve.
An email can contain a mixture of languages, so in a perfect world we
would be able to change locale per word (or per char? - eeek!). This also
bleeds into the issue of "ok_locales" not working (as desired) in the
modern UTF world. I.e. SA would need to "know" which locales an email
contains (which would help ok_locales) so that it could then dynamically
change word boundary definitions etc. for rules. Yuck.
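
To make the word-boundary point concrete, here is a minimal sketch in
Python (not SA's Perl code; the helper name "words" is made up for
illustration). It only shows that where \b falls depends on which
characters the regex engine counts as word characters. It does not
attempt any per-word language detection:

```python
import re

# Hypothetical helper for illustration only; not part of SpamAssassin.
def words(text, ascii_only=False):
    # With re.ASCII, \w is limited to [a-zA-Z0-9_]; without it, \w
    # follows Unicode, so accented letters count as word characters.
    flags = re.ASCII if ascii_only else 0
    return re.findall(r"\b\w+\b", text, flags)

# "naive cafe" with accents, written as escapes so the characters are
# precomposed regardless of how this mail gets re-encoded.
sample = "na\u00efve caf\u00e9"

print(words(sample))                   # ['naïve', 'café']
print(words(sample, ascii_only=True))  # ['na', 've', 'caf']
```

The same rule splits the same text differently depending on the
character-class definition in force, which is why one fixed setting
for a whole mixed-language message is unlikely to suit every word.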

Perhaps this should just be classified as a bug in Perl and forgotten
about ;-) [Does Python etc. handle this any better?]

-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +1 408 481 8171
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
