Hi all,

I am using the Spark ML Random Forest classifier. I have only two label categories (1 and 0), about 30 features, and a data set of over 100,000 records. I ran the Spark JavaRandomForestClassifierExample code, with some changes to print more detailed results, and the model produced the following:

Test Error = 0.022321731460750338
Prediction results label = 1 count:951
Prediction results label = 0 count:13788
Prediction results predictedLabel = 1 and label = 1 count:682
Prediction results predictedLabel = 1 and label = 0 count:60
Prediction results predictedLabel = 0 and label = 1 count:269
Prediction Right = 0.7171398527865405
Prediction Miss= 0.28286014721345953
Prediction Wrong= 0.004351610095735422
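For reference, the Right, Miss and Wrong rates above follow directly from the counts, so here is a minimal sketch of that arithmetic in plain Java (no Spark needed). The class name `ConfusionCounts` and its method names are made up for illustration and are not part of the Spark example. The `precision()` method is an extra figure that was not in the printed results.

```java
public class ConfusionCounts {
    // Counts copied from the printed results above.
    static final long TRUE_POS = 682;     // predictedLabel = 1 and label = 1
    static final long FALSE_POS = 60;     // predictedLabel = 1 and label = 0
    static final long FALSE_NEG = 269;    // predictedLabel = 0 and label = 1
    static final long POSITIVES = 951;    // label = 1
    static final long NEGATIVES = 13788;  // label = 0

    // "Prediction Right": share of label = 1 rows predicted as 1 (recall).
    static double right() {
        return (double) TRUE_POS / POSITIVES;
    }

    // "Prediction Miss": share of label = 1 rows predicted as 0.
    static double miss() {
        return (double) FALSE_NEG / POSITIVES;
    }

    // "Prediction Wrong": share of label = 0 rows predicted as 1.
    static double wrong() {
        return (double) FALSE_POS / NEGATIVES;
    }

    // "Test Error": all misclassified rows over all test rows.
    static double testError() {
        return (double) (FALSE_POS + FALSE_NEG) / (POSITIVES + NEGATIVES);
    }

    // Not in the printed results: share of predicted 1s that are really 1.
    static double precision() {
        return (double) TRUE_POS / (TRUE_POS + FALSE_POS);
    }

    public static void main(String[] args) {
        System.out.println("Prediction Right = " + right());
        System.out.println("Prediction Miss = " + miss());
        System.out.println("Prediction Wrong = " + wrong());
        System.out.println("Test Error = " + testError());
        System.out.println("Precision (label 1) = " + precision());
    }
}
```

So the low test error comes mostly from the 13788 rows with label = 0, which make up about 94% of the test set, while roughly 28% of the label = 1 rows are missed. That miss rate is the number I would like to bring down.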
I need some advice on how to improve the accuracy. I tried changing classifier parameters such as maxDepth and maxBins, but the results did not change much. Do I need to provide more features, or are there other ways to improve this?

Thanks