Thanks, Michel. How do you correct? Run it back through as ham?

C

>>> On 1/11/2007 at 10:32 AM, in message <[EMAIL PROTECTED]>, Michel R
Vaillancourt <[EMAIL PROTECTED]> wrote:
Clay Davis wrote:
> Over the past several months I have been saving the spam that slips
> through to my users accounts to train my bayes with. I notice that
> lately almost all of it has (what I am assuming to be) an attempt to
> poison my bayes (a bunch of valid words put together in a nonsensical
> paragraph) at the bottom of it.
>
> How much should I worry about this type of spam and how it will affect
> my bayes db? Work arounds? Advice?
>
> Thanks, gang.
>
> Clay

Hi, Clay.

Without getting into the math behind it, Bayes poisoning is almost
impossible. I have been training my Bayes DB with everything I consider
"spam", whether it has a "poison" section or not. I'm almost always
seeing a BAYES_99 result on these "poisoned" emails. Why? Because the
key tokens that make it spam are repeated; the "poison" text is not.

I use a combination of auto-training and hand-correction with my DB. I
only "correct" if the answer is not a BAYES_99.

Don't sweat the "poison"; Bayes is almost immune to Iocane, etc.

--
--Michel Vaillancourt
Wolfstar Systems
www.wolfstar.ca
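
For anyone curious about the math being skipped here, the point about repeated key tokens versus one-off "poison" words can be sketched with a toy classifier. This is only an illustration under simplifying assumptions: it uses a simplified, Graham-style per-token probability combined as log-odds, not SpamAssassin's actual Bayes code, and the training messages, function names, and the 0.01/0.99 clamp are all made up for the example.

```python
import math
from collections import Counter


def train(messages):
    """Count, for each token, how many messages it appears in."""
    counts = Counter()
    for msg in messages:
        counts.update(set(msg.lower().split()))
    return counts


def token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham):
    """Per-token spam probability; tokens never seen in training are neutral."""
    s = spam_counts[token] / n_spam
    h = ham_counts[token] / n_ham
    if s + h == 0:
        return 0.5  # unseen token: no evidence either way
    # Clamp (hypothetical bounds) so no single token is treated as certain.
    return min(0.99, max(0.01, s / (s + h)))


def spam_score(message, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token probabilities by summing their log-odds."""
    log_odds = 0.0
    for token in set(message.lower().split()):
        p = token_spam_prob(token, spam_counts, ham_counts, n_spam, n_ham)
        log_odds += math.log(p) - math.log(1.0 - p)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up training data: every spam repeats the same payload tokens,
# while the trailing "poison" filler is different in each message.
spam = [
    "cheap pills online order now garden trumpet velvet",
    "cheap pills online order now harbor candle orbit",
    "cheap pills online order now window marble thunder",
]
ham = [
    "meeting notes attached please review before friday",
    "lunch on friday works for me see you then",
    "please review the patch and order of the commits",
]

spam_counts, ham_counts = train(spam), train(ham)

clean_score = spam_score(
    "cheap pills online order now",
    spam_counts, ham_counts, len(spam), len(ham))
poisoned_score = spam_score(
    "cheap pills online order now lantern copper meadow",
    spam_counts, ham_counts, len(spam), len(ham))

print("clean:", clean_score)
print("poisoned:", poisoned_score)
```

In this toy model the fresh filler words have never been seen in training, so each contributes zero log-odds and the payload tokens decide the score. Filler built from words that are common in the recipient's own ham would pull the score down somewhat, which is presumably why the immunity is "almost" rather than total.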