It depends on your data.
How did you split the data into training and test sets?
How well does the model fit the data?

You could, of course, also try feeding more data into the model.
Have you considered alternative machine learning models?

I do not think this is a Spark problem; you should ask a machine learning 
specialist who knows your data and random forests.
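
Since only about 6% of your test rows have label 1, the overall test error says little about how well the rare class is handled. As a rough sketch (plain Java, not part of the Spark example; the class and method names are made up for illustration), the counts from your message can be turned into precision, recall and F1 for label 1, and compared against the error of a trivial model that always predicts 0. The recall computed here is the same quantity as your "Prediction Right" figure.

```java
public class ConfusionMetrics {

    // Fraction of rows predicted as 1 that really have label 1.
    public static double precision(long tp, long fp) {
        return (double) tp / (tp + fp);
    }

    // Fraction of rows with label 1 that the model predicted as 1.
    public static double recall(long tp, long fn) {
        return (double) tp / (tp + fn);
    }

    // Harmonic mean of precision and recall.
    public static double f1(double precision, double recall) {
        return 2 * precision * recall / (precision + recall);
    }

    // Error rate of a model that always predicts the majority label 0.
    public static double majorityBaselineError(long positives, long negatives) {
        return (double) positives / (positives + negatives);
    }

    public static void main(String[] args) {
        // Counts copied from the quoted message.
        long tp = 682;          // predictedLabel = 1 and label = 1
        long fp = 60;           // predictedLabel = 1 and label = 0
        long fn = 269;          // predictedLabel = 0 and label = 1
        long positives = 951;   // label = 1
        long negatives = 13788; // label = 0

        double p = precision(tp, fp);
        double r = recall(tp, fn);

        System.out.println("precision (label 1)     = " + p);
        System.out.println("recall (label 1)        = " + r);
        System.out.println("F1 (label 1)            = " + f1(p, r));
        System.out.println("always-predict-0 error  = "
                + majorityBaselineError(positives, negatives));
    }
}
```

If recall on label 1 is what matters to you, tracking these numbers while you change features or parameters will probably tell you more than the single test error value.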


> On 18 Aug 2016, at 10:31, 陈哲 <czhenj...@gmail.com> wrote:
> 
> Hi All,
>    I am using the Spark ML Random Forest classifier. I have only two label 
> categories (1, 0), about 30 features, and a data size of over 100,000. I ran 
> the Spark JavaRandomForestClassifierExample code, and the model produced the 
> following results (I made some changes to show more detail):
> Test Error = 0.022321731460750338
> Prediction results label = 1 count:951
> Prediction results label = 0 count:13788
> Prediction results predictedLabel = 1 and label = 1 count:682
> Prediction results predictedLabel = 1 and label = 0 count:60
> Prediction results predictedLabel = 0 and label = 1 count:269
> Prediction Right = 0.7171398527865405
> Prediction Miss= 0.28286014721345953
> Prediction Wrong= 0.004351610095735422
> 
> I need some advice on how to improve the accuracy. I tried changing 
> classifier parameters such as maxDepth and maxBins, but it doesn't change much.
> Do I have to provide more features, or are there other ways to improve this?
> 
> Thanks
> 

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
