On Thu, Jul 13, 2006 at 03:17:05PM +0800, Joshua, C.S. Chen wrote:
> Hello folks,
> My users write in Chinese. I found that SpamAssassin's Bayes filter does
> not seem to work well with Chinese charsets (UTF-8 or Big5). Many
> legitimate mails (almost all) get BAYES_99, just as the real spam does. It
> looks like mail in a foreign language such as Chinese very easily gets a
> high Bayes score.
> I have set "ok_locales all", but it doesn't help with the false positives.
> 
> And another question: what should I expect if I run sa-learn --dump? Am I
> supposed to see the phrases that SA has learned, such as key phrases and
> words from the spam mails? If so, can I see some Chinese phrases?

Do you regularly "feed" the spam filter Chinese emails, both ham and spam?
That would probably be the best way to improve the accuracy of the
Bayesian filter.
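
As a rough sketch of what such feeding could look like, assuming you have
sorted some Chinese mail by hand into one ham directory and one spam
directory: the function name train_bayes and the example paths are made up
for illustration, and only the sa-learn options --ham, --spam and
--dump magic come from SpamAssassin itself.

```shell
#!/bin/sh
# Hypothetical helper: train the Bayes database from two hand-sorted
# mail directories. SA_LEARN can be overridden, e.g. with "echo" for a
# dry run that only prints the arguments.
SA_LEARN="${SA_LEARN:-sa-learn}"

train_bayes() {
    ham_dir="$1"    # assumed: directory holding legitimate Chinese mail
    spam_dir="$2"   # assumed: directory holding Chinese spam

    # Learn the legitimate mail as ham, then the junk mail as spam.
    $SA_LEARN --ham "$ham_dir" || return 1
    $SA_LEARN --spam "$spam_dir" || return 1

    # Print the database summary so the learned counts can be checked.
    $SA_LEARN --dump magic
}

# Example call (placeholder paths, adjust to your own mail store):
# train_bayes "$HOME/Maildir/.ham/cur" "$HOME/Maildir/.spam/cur"
```

Running it again whenever new hand-sorted mail has accumulated keeps the
training regular, and the summary at the end should show the ham and spam
counts growing.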

Regards
Johann
-- 
Johann Spies          Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

     "Let your character be free from the love of money,
      being content with what you have; for He Himself has
      said, "I will never desert you, nor will I ever
      forsake you."
                              Hebrews 13:5
