On Mon, 22 Aug 2016 07:16:41 -0700
Marc Perkel <supp...@junkemailfilter.com> wrote:

> Anthony, Yes - I don't store Set B. I store Set A. B is defined by 
> what's NOT in A. So I test A and if it's not matched it's set B. Set
> B is just a negative match on A.

Let me ask you a question.  As far as I understand your algorithm, if
an email contains at least one token in the "ham" set and zero tokens in
the "spam" set, you classify it as ham.  And conversely, if it contains
at least one spam token but zero ham tokens, you classify it as spam.

The other two possibilities (no tokens in either or some tokens in both)
are undecidable.

So: what percentage of emails are actually decidable under your algorithm?
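
To check that I'm reading the rule correctly, here is a minimal Python sketch of it as I understand it. The names (classify, decidable_fraction, ham_set, spam_set) are my own inventions for illustration, not anything from your implementation, and I'm assuming an email has already been reduced to a list of tokens.

```python
def classify(tokens, ham_set, spam_set):
    """Return 'ham', 'spam', or None when the rule cannot decide.

    Hypothetical helper: a sketch of the rule as described in this
    thread, not the actual implementation.
    """
    toks = set(tokens)
    has_ham = bool(toks & ham_set)    # at least one ham-only token
    has_spam = bool(toks & spam_set)  # at least one spam-only token
    if has_ham and not has_spam:
        return "ham"
    if has_spam and not has_ham:
        return "spam"
    return None  # no tokens in either set, or tokens in both


def decidable_fraction(emails, ham_set, spam_set):
    """Fraction of emails (each a list of tokens) that the rule decides."""
    if not emails:
        return 0.0
    decided = sum(classify(e, ham_set, spam_set) is not None for e in emails)
    return decided / len(emails)
```

Running decidable_fraction over a labelled corpus would give the number I'm asking about.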

Regards,

Dianne.
