On 12.04.2016 at 18:44, Yu Qian wrote:
SpamAssassin uses Bayes as its classifier, which is typical and efficient for English. But how does it process other languages, such as Asian languages? Can anyone explain that, or show the code where SpamAssassin does it?
Bayes is by definition language agnostic: *you train* Bayes with samples of ham and spam (at least a few hundred of each), and the tokenizer splits the messages into parts and builds a database of how often each word appears in spam and in ham (a simplified explanation).
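
To make the idea concrete, here is a minimal sketch of that train-and-count approach in Python. It is not SpamAssassin's actual code (which is written in Perl and uses a more elaborate tokenizer and probability combining); the `tokenize` function, the `TinyBayes` class, the add-one smoothing and the equal priors for spam and ham are all assumptions made for illustration. The point is only that the classifier counts whatever tokens it is handed, so nothing in the counting step depends on the language.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical, simplified tokenizer: lowercase runs of word characters.
    # A real tokenizer does far more; this is the only language-sensitive step.
    return re.findall(r"\w+", text.lower())


class TinyBayes:
    """Illustrative naive Bayes spam scorer (not SpamAssassin's algorithm)."""

    def __init__(self):
        # Per class: in how many trained messages each token appeared.
        self.counts = {"spam": Counter(), "ham": Counter()}
        # Per class: how many messages were trained.
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.messages[label] += 1
        # Count each token once per message.
        self.counts[label].update(set(tokenize(text)))

    def spam_probability(self, text):
        # Sum log-likelihood ratios over tokens, assuming equal priors.
        log_odds = 0.0
        for tok in set(tokenize(text)):
            # Add-one smoothing so unseen tokens never give a zero probability.
            p_spam = (self.counts["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Convert log odds back to a probability.
        return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    clf = TinyBayes()
    clf.train("buy cheap pills now", "spam")
    clf.train("meeting agenda for monday", "ham")
    print(clf.spam_probability("cheap pills"))
    print(clf.spam_probability("monday meeting"))
```

With only one message per class the scores are of course meaningless in practice, which is why a few hundred samples of each are needed before the database becomes useful.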