On 09/28, dar...@chaosreigns.com wrote:
> On 09/28, Marc Perkel wrote:
> > You would only have to test the rule combinations that the message
> > actually triggered. So if it hit 10 rules then it would be 1024
> > combinations. Seems not to be unreasonable to me.

> combinations in the actual corpora would be much higher.  I'll try to
> get you a number.

360,468.  That is the number of combinations of rules seen in the actual
mass-check corpora, from the latest -net run (2011-09-24), after stripping
out T_* and __* rules but not "tflags nopublish" rules.  So it would only
take about 394 times as much data submitted via mass-check as we currently
have to maintain a similar level of accuracy :)
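
For reference, a message that hits 10 rules has 2^10 = 1024 subsets of
those rules, which is where the 1024 above comes from.  A minimal sketch
of the corpus-wide count follows, assuming each message has already been
reduced to a list of the rule names it hit.  The function name and the
sample data are hypothetical, and this is not the script that produced
the number above.

```python
def distinct_combinations(messages):
    """Count distinct sets of rules hit, ignoring T_* and __* rules."""
    seen = set()
    for rules in messages:
        # Drop test rules (T_*) and sub-rules (__*); order does not matter.
        kept = frozenset(r for r in rules if not r.startswith(("T_", "__")))
        seen.add(kept)
    return len(seen)


# Hypothetical sample: the first two messages hit the same combination
# once the T_* and __* rules are stripped.
sample = [
    ["BAYES_99", "__HAS_FROM", "URIBL_BLACK"],
    ["URIBL_BLACK", "BAYES_99", "T_TEST_RULE"],
    ["BAYES_00"],
]
print(distinct_combinations(sample))
```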

It seems likely I could find something useful in this direction, though,
by looking for combinations of 2 or 3 rules that show up relatively often
in mis-categorized emails.
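
A rough sketch of that search, assuming the mis-categorized messages are
available as lists of rule names.  The function name, parameters, and
sample data are hypothetical.  It only counts raw frequency; judging
"relatively often" would also need the same counts from correctly
categorized mail for comparison.

```python
from collections import Counter
from itertools import combinations


def frequent_combos(misclassified, sizes=(2, 3), top=5):
    """Return the most common 2- and 3-rule combinations."""
    counts = Counter()
    for rules in misclassified:
        # Sort and de-duplicate so each combination has one canonical form.
        unique = sorted(set(rules))
        for k in sizes:
            counts.update(combinations(unique, k))
    return counts.most_common(top)


# Hypothetical rule hits from three mis-categorized messages.
misclassified = [
    ["RULE_A", "RULE_B", "RULE_C"],
    ["RULE_A", "RULE_B"],
    ["RULE_B", "RULE_C", "RULE_D"],
]
for combo, n in frequent_combos(misclassified, top=2):
    print(n, ",".join(combo))
```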

-- 
"Am I a man who dreamed I was a butterfly, or am I a butterfly who is
dreaming I am a man?" - Chuang Tsu, ~350 BC
http://www.ChaosReigns.com
