On Mon, 22 Aug 2016 09:03:38 -0700 Marc Perkel <supp...@junkemailfilter.com> wrote:

> The ones that are the same are of no interest. Only where it matches
> one side and not the other.

But... but... that's exactly like Bayes if you throw out tokens whose
observed probability is not 0 or 1.

Also, in your list of tokens, they are all phrases ranging from 1 to 4
words, and that's why you get good results. Multiword Bayes is just as
good, and I know that from experience.

Regards,

Dianne.
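
To make the comparison above concrete, here is a minimal sketch of a multiword Bayes-style filter that discards every token whose observed spam probability is not exactly 0 or 1. It is only an illustration under stated assumptions, not the implementation either poster uses: the function names (`phrases`, `train`, `classify`), the whitespace tokenizer, and the simple vote at the end are all hypothetical choices made for this example.

```python
from collections import Counter


def phrases(text, max_n=4):
    """Return the set of 1- to max_n-word phrases found in text."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    }


def train(spam_msgs, ham_msgs):
    """Map each token to its observed spam probability, keeping only 0 or 1.

    Assumes both message lists are non-empty.
    """
    spam_counts = Counter(t for m in spam_msgs for t in phrases(m))
    ham_counts = Counter(t for m in ham_msgs for t in phrases(m))
    probs = {}
    for tok in spam_counts.keys() | ham_counts.keys():
        s = spam_counts[tok] / len(spam_msgs)  # fraction of spam containing tok
        h = ham_counts[tok] / len(ham_msgs)    # fraction of ham containing tok
        p = s / (s + h)
        # Throw out every token seen on both sides; what survives is
        # "matches one side and not the other".
        if p in (0.0, 1.0):
            probs[tok] = p
    return probs


def classify(probs, text):
    """Vote using only the one-sided tokens present in the message."""
    toks = phrases(text)
    spam_hits = sum(1 for t in toks if probs.get(t) == 1.0)
    ham_hits = sum(1 for t in toks if probs.get(t) == 0.0)
    if spam_hits > ham_hits:
        return "spam"
    if ham_hits > spam_hits:
        return "ham"
    return "unsure"
```

With the tokens restricted this way, the surviving table is the same "one side only" list described in the quoted text. A full Bayes filter would also keep the tokens in between and weight them by their probabilities.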