MLLib : Decision Tree with minimum points per node

Justin Yip Fri, 13 Jun 2014 20:56:27 -0700

Hello,

I have been playing around with mllib's decision tree library. It is
working great, thanks.


I have a question regarding overfitting. It appears to me that the current
implementation doesn't allow the user to specify the minimum number of
samples per node. As a result, some nodes contain only a few samples, which
can lead to overfitting.
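
As a rough illustration of the constraint in question, here is a minimal
sketch in plain Python. It is not MLlib code: the names gini, best_split and
min_samples_per_node are hypothetical, and it assumes a single numeric
feature with Gini impurity as the split criterion. A candidate split is
rejected whenever either child would hold fewer than min_samples_per_node
points.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(points, min_samples_per_node=1):
    """Find the best threshold on one numeric feature.

    points: list of (feature_value, label) pairs.
    Returns (threshold, weighted_impurity), or None when no split leaves
    at least min_samples_per_node points in both children.
    """
    points = sorted(points)
    n = len(points)
    best = None
    for i in range(1, n):
        # A threshold cannot separate two identical feature values.
        if points[i - 1][0] == points[i][0]:
            continue
        left = [label for _, label in points[:i]]
        right = [label for _, label in points[i:]]
        # The constraint: skip splits that would create a tiny child node.
        if len(left) < min_samples_per_node or len(right) < min_samples_per_node:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        threshold = (points[i - 1][0] + points[i][0]) / 2
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: one point of class 0, four points of class 1.
data = [(1.0, 0), (2.0, 1), (3.0, 1), (4.0, 1), (5.0, 1)]
unconstrained = best_split(data)
constrained = best_split(data, min_samples_per_node=2)
too_strict = best_split(data, min_samples_per_node=3)
```

On this toy data the unconstrained search isolates the single class-0 point
in its own node, the constrained search has to pick a less pure but larger
split, and a minimum of three leaves no valid split, so the node would stay
a leaf.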

I would like to know if there is a workaround or any other way to prevent
overfitting. Alternatively, will the decision tree support
min-samples-per-node in a future release?

Thanks.

Justin
