On Fri, 20 Feb 2015 21:36:38 +0100
Reindl Harald wrote:

> > And I'd suggest the same for non-spam, train duplicative ham even
> > if it happens to be similarly addressed to different users. More
> > data is (nearly) always better for Bayesian learning systems
> 
> of course

With the caveat that you keep an eye on retention.


> in doubt the amount of trained ham and spam should be near 50%,


This is a myth. What's important is to have enough of each; the actual
ratio is not important.
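
A rough sketch of why the ratio drops out, assuming a Graham/Robinson-style
per-token estimate in which each count is divided by the size of its own
corpus. The function name and the numbers are made up for illustration, and
this is not a claim about what any particular filter implements:

```python
def token_spam_prob(spam_hits, ham_hits, n_spam, n_ham):
    """Per-token spam probability from class-normalized frequencies.

    Hypothetical helper: each count is divided by the size of its own
    corpus, so only the per-class frequencies matter.
    """
    spam_freq = spam_hits / n_spam  # fraction of trained spam containing the token
    ham_freq = ham_hits / n_ham     # fraction of trained ham containing the token
    return spam_freq / (spam_freq + ham_freq)


# The token shows up in 20% of spam and 1% of ham in both corpora.
balanced = token_spam_prob(200, 10, n_spam=1000, n_ham=1000)    # 1:1 spam to ham
skewed = token_spam_prob(2000, 10, n_spam=10000, n_ham=1000)    # 10:1 spam to ham

print(balanced, skewed)
```

Both calls give the same estimate because the corpus sizes cancel. What a
lopsided database can cost you is precision: if one side is too small, its
frequencies are estimated from only a handful of messages.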
