On Tue, 21 Aug 2012, Adam Moffett wrote:
> One of our users definitely emails with Chinese vendors. I'm sure they
> correspond in English, but I'm guessing the Chinese folks might have
> Chinese characters in their signature line or some such.
Consider Bayes.
I have trained my Bayes database on Chinese-language spam, and those
messages all hit BAYES_99 now. If you do decide to train on
Chinese-language spam, you will definitely want to also train on ham from
your user's Chinese vendors, to cover any use of non-Latin characters in
.sigs or message headers.
Be sure to keep your training corpora on hand so that you can un-train
those messages if it doesn't work out.
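
As a rough sketch of that workflow (not a tested recipe), the Python below
builds the sa-learn command lines for training on spam, training on ham, and
later un-training with --forget. The --spam, --ham and --forget options are
sa-learn's own; the corpus directory names and the helper functions are
hypothetical, and the script only prints the commands unless dry_run is
turned off.

```python
import subprocess

# Map each action to the matching sa-learn option.
MODES = {"spam": "--spam", "ham": "--ham", "forget": "--forget"}


def sa_learn_cmd(mode, corpus_dir):
    """Build the sa-learn command line for one saved corpus directory."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    return ["sa-learn", MODES[mode], str(corpus_dir)]


def run(mode, corpus_dir, dry_run=True):
    """Print the command, or execute it when dry_run is False."""
    cmd = sa_learn_cmd(mode, corpus_dir)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical directories holding the saved training messages.
    run("spam", "corpus/chinese-spam")
    run("ham", "corpus/chinese-vendor-ham")
    # If the results are poor, un-train from the same saved copies:
    # run("forget", "corpus/chinese-spam")
```

Keeping the exact messages that were fed to --spam and --ham is what makes
the --forget step possible later.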
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79