Re: SparkML Using Pipeline API locally on driver

2016-02-28 Thread Yanbo Liang
Hi Jean, a DataFrame is tied to a SQLContext, which in turn is tied to a SparkContext, so I think it's impossible to run `model.transform` without touching Spark. I think what you need is for the model to support prediction on a single instance; then you can make predictions without Spark. You can track t

SparkML Using Pipeline API locally on driver

2016-02-26 Thread Eugene Morozov
Hi everyone. I have a requirement to run predictions for a random forest model locally in a web service, without touching Spark at all, in some specific cases. I've achieved that with the previous MLlib API (Java 8 syntax): public List> predictLocally(RandomForestModel model, List data) { retu