Thanks, Michel.  How do you correct?  Run it back through as ham?
C

>>> On 1/11/2007 at 10:32 AM, in message <[EMAIL PROTECTED]>,
Michel R Vaillancourt <[EMAIL PROTECTED]> wrote:
Clay Davis wrote:
> Over the past several months I have been saving the spam that slips
> through to my users' accounts to train my bayes with.  I notice that
> lately almost all of it has (what I am assuming to be) an attempt to
> poison my bayes (a bunch of valid words put together in a nonsensical
> paragraph) at the bottom of it.
>  
> How much should I worry about this type of spam and how it will affect
> my bayes db?  Workarounds?  Advice?
>  
> Thanks, gang.
>  
> Clay

Hi, Clay.  Without getting into the math behind it, Bayes poisoning is
almost impossible.  I have been training my Bayes DB with everything I
consider "spam", whether it has a "poison" section or not.  I'm almost
always seeing a BAYES_99 result on these "poisoned" emails.  Why? 
Because the key tokens that make it spam are repeated;  the "poison"
text is not.
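
To make the repeated-token argument concrete, here is a minimal sketch in
Python.  It is a toy naive-Bayes scorer, not SpamAssassin's actual Bayes
code: the token names, the vocabulary size, the message counts and the
smoothing constants s and x are all assumptions picked for illustration.
Every spam carries the same few key tokens plus 20 random filler words;
the key tokens pile up evidence across the corpus, while each filler word
shows up in only a handful of messages on either side.

```python
import math
import random

def train(messages):
    """Count, for each token, how many messages contain it."""
    counts = {}
    for msg in messages:
        for tok in set(msg):
            counts[tok] = counts.get(tok, 0) + 1
    return counts

def token_prob(tok, spam_counts, ham_counts, n_spam, n_ham, s=1.0, x=0.5):
    """Smoothed estimate of P(spam | token); s and x are assumed constants."""
    in_spam = spam_counts.get(tok, 0)
    in_ham = ham_counts.get(tok, 0)
    n = in_spam + in_ham
    if n == 0:
        return x  # never-seen token: no evidence either way
    spam_freq = in_spam / n_spam
    ham_freq = in_ham / n_ham
    raw = spam_freq / (spam_freq + ham_freq)
    # Pull rarely seen tokens toward the neutral value x.
    return (s * x + n * raw) / (s + n)

def score(msg, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token probabilities as summed log-odds (toy model)."""
    log_odds = 0.0
    for tok in set(msg):
        p = token_prob(tok, spam_counts, ham_counts, n_spam, n_ham)
        log_odds += math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-log_odds))

rng = random.Random(0)
vocab = [f"w{i}" for i in range(2000)]  # hypothetical "valid words"
key_tokens = ["refinance", "mortgage", "unsubscribe", "guaranteed"]

def filler():
    """Twenty random ordinary words, standing in for the poison paragraph."""
    return rng.sample(vocab, 20)

ham = [filler() for _ in range(200)]
spam = [key_tokens + filler() for _ in range(200)]  # every spam is "poisoned"
spam_counts, ham_counts = train(spam), train(ham)

padding = filler()
poisoned = key_tokens + padding
poisoned_score = score(poisoned, spam_counts, ham_counts, len(spam), len(ham))
padding_score = score(padding, spam_counts, ham_counts, len(spam), len(ham))
key_prob = token_prob("refinance", spam_counts, ham_counts, len(spam), len(ham))

print(f"P(spam | 'refinance')  = {key_prob:.4f}")
print(f"score, poisoned spam   = {poisoned_score:.4f}")
print(f"score, padding alone   = {padding_score:.4f}")
```

In this sketch the filler words are seen about equally often in ham and
spam, so each one contributes little, while each key token has been seen
in every training spam and in no ham.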

I use a combination of auto-training and hand-correction with my DB.  I
only "correct" if the answer is not a BAYES_99.  Don't sweat the
"poison"; Bayes is almost immune to Iocane, etc.

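The "only correct if it is not a BAYES_99" rule could be sketched as
follows.  This assumes the message carries an X-Spam-Status header listing
the rules that fired, and that missed spam is re-learned with
"sa-learn --spam"; the function names and the sample messages are
hypothetical, and the command is only built here, not run.

```python
import email

def needs_correction(raw_message):
    """True if a known-spam message did not hit BAYES_99."""
    msg = email.message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    return "BAYES_99" not in status

def relearn_command(path):
    """Command line that would feed the message at `path` back in as spam."""
    return ["sa-learn", "--spam", path]

# Hypothetical sample messages, both known by hand to be spam.
missed = (
    "X-Spam-Status: No, score=3.1 required=5.0 tests=BAYES_80,HTML_MESSAGE\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
caught = (
    "X-Spam-Status: Yes, score=9.4 required=5.0 tests=BAYES_99,HTML_MESSAGE\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

for name, raw in (("missed.eml", missed), ("caught.eml", caught)):
    if needs_correction(raw):
        print(" ".join(relearn_command(name)))
    else:
        print(f"{name}: already BAYES_99, leave it alone")
```
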
-- 
--Michel Vaillancourt
Wolfstar Systems
www.wolfstar.ca
