On Fri, 05 Mar 2010 18:39:25 +0100 Kai Schaetzl <mailli...@conactive.com> wrote:
> Alex wrote on Fri, 5 Mar 2010 11:02:35 -0500:
>
> > I've trained probably 50 of these, yet they still have BAYES_50.
>
> I trained your example and it went from 50 to 99. With *1* message!
> There may be something wrong with your Bayes. With 400.000 tokens in
> the db.

There's nothing odd about that; it's common for hard-to-learn spam to be
identified correctly on retesting. The first time, there are only weak
tokens and tokens that aren't in the database (and mostly won't be seen
again). The second time, you have the weak tokens plus dozens of new spam
hapaxes (tokens seen exactly once), which dominate the result.
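
To put rough numbers on the hapax effect, here is a toy sketch in Python. It is not SpamAssassin's code: the Robinson-style smoothing with s=1 and x=0.5, the plain naive-Bayes combiner, the corpus sizes and the token counts are all assumptions picked for illustration, and SpamAssassin's real combiner and constants differ. The point is only that 30 tokens which were neutral on the first pass each lean towards spam after a single training run, and together they swamp the weak tokens.

```python
import math

def token_prob(spam_count, ham_count, nspam, nham, s=1.0, x=0.5):
    """Smoothed spam probability for one token (s and x are assumed values)."""
    n = spam_count + ham_count
    if n == 0:
        return x  # token not in the database: treated as neutral
    spam_ratio = spam_count / nspam
    ham_ratio = ham_count / nham
    p = spam_ratio / (spam_ratio + ham_ratio)
    # Pull p towards x; the fewer sightings, the stronger the pull.
    return (s * x + n * p) / (s + n)

def combine(probs):
    """Naive-Bayes combination of token probabilities, done in log-odds space."""
    log_odds = sum(math.log(p) - math.log(1.0 - p) for p in probs)
    return 1.0 / (1.0 + math.exp(-log_odds))

def score(tokens, nspam, nham):
    return combine([token_prob(sc, hc, nspam, nham) for sc, hc in tokens])

# Hypothetical database: 1000 spam and 1000 ham messages learned so far.
NSPAM, NHAM = 1000, 1000

# (spam_count, ham_count) per token in the message being tested.
weak = [(50, 50), (60, 55), (40, 45)]   # seen about equally in spam and ham
unseen = [(0, 0)] * 30                  # not in the database yet

# First test: weak tokens plus unknown tokens.
before = score(weak + unseen, NSPAM, NHAM)

# Learn the message once as spam: every token gains one spam sighting,
# so the 30 unknown tokens become spam hapaxes.
learned = [(sc + 1, hc) for sc, hc in weak + unseen]
after = score(learned, NSPAM + 1, NHAM)

print(f"before: {before:.3f}  after: {after:.3f}")
```

With these made-up counts the first score sits close to the middle of the range and the retest lands very near the top, which is the same jump as going from BAYES_50 to BAYES_99 after learning one message.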