Just found that Google Dataproc has a preview of Spark 2.0. Tried it, and save/load works! Thanks Shuai. Follow-up question: is there a way to export the pyspark.ml models to PMML? If not, what is the best way to integrate the model for inference in a production service?
On Tue, Jul 19, 2016 at 8:22 PM Ajinkya Kale <kaleajin...@gmail.com> wrote:

> I am using google cloud dataproc which comes with spark 1.6.1. So upgrade
> is not really an option.
> No way / hack to save the models in spark 1.6.1 ?
>
> On Tue, Jul 19, 2016 at 8:13 PM Shuai Lin <linshuai2...@gmail.com> wrote:
>
>> It's added in not-released-yet 2.0.0 version.
>>
>> https://issues.apache.org/jira/browse/SPARK-13036
>> https://github.com/apache/spark/commit/83302c3b
>>
>> so i guess you need to wait for 2.0 release (or use the current rc4).
>>
>> On Wed, Jul 20, 2016 at 6:54 AM, Ajinkya Kale <kaleajin...@gmail.com> wrote:
>>
>>> Is there a way to save a pyspark.ml.feature.PCA model ? I know mllib has
>>> that but mllib does not have PCA afaik. How do people do model persistence
>>> for inference using the pyspark ml models ? Did not find any documentation
>>> on model persistency for ml.
>>>
>>> --ajinkya