Currently, spark-ml models and pipelines are only usable in Spark. This means you must use Spark's machinery (and pull in all its dependencies) to do model serving. There is also currently no fast "predict" method for a single Vector instance.
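As a workaround for linear models, you can export the coefficients once and compute predictions in plain Java, with no Spark dependency at serving time. Here is a minimal sketch; it assumes the coefficient array and intercept were extracted from the trained model beforehand (e.g. via `model.coefficients.toArray()` and `model.intercept` in Spark ML), and the numeric values shown are hypothetical placeholders.

```java
// Sketch: serving a Spark ML linear regression model without Spark.
// The coefficients and intercept are assumed to have been exported from
// the trained model ahead of time; values here are hypothetical.
public class LinearModelServer {
    private final double[] coefficients;
    private final double intercept;

    public LinearModelServer(double[] coefficients, double intercept) {
        this.coefficients = coefficients;
        this.intercept = intercept;
    }

    // A linear regression prediction is just a dot product plus the
    // intercept -- no DataFrame or Spark session needed per request.
    public double predict(double[] features) {
        double sum = intercept;
        for (int i = 0; i < coefficients.length; i++) {
            sum += coefficients[i] * features[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        LinearModelServer model =
                new LinearModelServer(new double[] {0.5, -1.2, 3.0}, 0.1);
        System.out.println(model.predict(new double[] {1.0, 2.0, 0.5}));
    }
}
```

Because this is plain arithmetic on primitive arrays, a single prediction takes microseconds rather than the milliseconds-to-seconds cost of building a one-row DataFrame and running it through a pipeline.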
So for now, you are best off going with PMML, or exporting your model in your own custom format and re-loading it from that format for serving. You can also take a look at PredictionIO (https://prediction.io/) for another serving option, or TensorFlow Serving (https://tensorflow.github.io/serving/).

On Thu, 23 Jun 2016 at 13:40 philippe v <glaphili...@gmail.com> wrote:

> Hello,
>
> I trained a linear regression model with spark-ml. I serialized the model
> pipeline with classical Java serialization, then loaded it in a web
> service to compute predictions.
>
> For each request received by the web service, I create a 1-row DataFrame
> to compute that prediction.
>
> The problem is that it takes too much time.
>
> Are there good practices for doing this kind of thing?
>
> I could export all the model's coefficients with PMML and do the
> computations in pure Java, but I am keeping that as a last resort.
>
> Does anyone have hints for improving performance?
>
> Philippe
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Performance-issue-with-spark-ml-model-to-make-single-predictions-on-server-side-tp27217.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org