On Wed, 14 Feb 2018 16:20:30 +0100
Matus UHLAR - fantomas wrote:

> >On Tue, 13 Feb 2018 21:02:46 +0000
> >Horváth Szabolcs wrote:  
> >> One more question: is there a recommended ham to spam ratio? 1:1?  
> 
> On 14.02.18 15:09, RW wrote:
> >No, this is a myth.  Bayes computes token probabilities from a
> >token's frequencies in spam and ham, so it all scales through. If
> >you have 2000 ham and 200 spam the problem is too few spams, not a
> >bad ratio.  
> 
> my experience says you will need more ham than spam, because you want
> to get rid of false positives (ham marked as spam) much more than of
> false negatives.


My point is that an imbalance doesn't create a bias.
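
To illustrate, here is a rough sketch of why the ratio scales out. It assumes the simple Graham/Robinson-style estimate, where each count is divided by the size of its own corpus before the two are compared. The function name and the counts are made up for the example; this is not SpamAssassin's actual code.

```python
def token_spam_probability(spam_hits, n_spam, ham_hits, n_ham):
    """Estimate P(spam | token) from per-corpus frequencies.

    Hypothetical helper for illustration only. Each count is normalised
    by the size of its own corpus, so the ham:spam ratio cancels out.
    """
    spam_freq = spam_hits / n_spam if n_spam else 0.0
    ham_freq = ham_hits / n_ham if n_ham else 0.0
    if spam_freq + ham_freq == 0.0:
        return 0.5  # token never seen: no evidence either way
    return spam_freq / (spam_freq + ham_freq)


# A token that shows up in 10% of spam and 1% of ham.
balanced = token_spam_probability(200, 2000, 20, 2000)   # 2000 spam, 2000 ham
imbalanced = token_spam_probability(20, 200, 20, 2000)   # 200 spam, 2000 ham

print(round(balanced, 4), round(imbalanced, 4))
```

Both calls give the same probability, because only the per-corpus frequencies matter. What changes with 200 spams is that the spam frequency rests on 20 sightings instead of 200, so the estimate is noisier, not skewed.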
