Hello, I have been playing around with MLlib's decision tree library. It is working great, thanks.
I have a question regarding overfitting. It appears to me that the current implementation doesn't allow users to specify the minimum number of samples per node. As a result, some nodes end up containing very few samples, which can lead to overfitting. Is there a workaround or any other way to prevent overfitting? Or will the decision tree support min-samples-per-node in a future release?
Thanks,
Justin
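
P.S. To make the request concrete, here is a minimal sketch of the check I have in mind. This is not MLlib code: the function name and the min_samples_per_node parameter are hypothetical and only illustrate the idea of rejecting a candidate split when either child would fall below a minimum sample count.

```python
# Hypothetical helper, not part of MLlib: it only illustrates the constraint.
def split_allowed(left_count, right_count, min_samples_per_node):
    """Return True if both children would keep at least the minimum number of samples."""
    return (left_count >= min_samples_per_node
            and right_count >= min_samples_per_node)


# With a minimum of 5 samples per node, a 97/3 split would be rejected,
# while a 60/40 split would be accepted.
print(split_allowed(97, 3, 5))   # False
print(split_allowed(60, 40, 5))  # True
```

With a check like this, a node whose best split fails the test would simply become a leaf instead of being split further.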
