Hi Sebastian,

You can save models to disk and load them back. In the snippet below (copied from a working Databricks notebook), I train a model, save it to disk, and then load it back from disk into model2.

    import org.apache.spark.mllib.tree.RandomForest
    import org.apache.spark.mllib.tree.model.RandomForestModel

    // Train the model
    val model = RandomForest.trainClassifier(data, numClasses,
        categoricalFeaturesInfo, numTrees, featureSubsetStrategy,
        impurity, maxDepth, maxBins, seed)

    // Save it to disk
    model.save(sc, inputDir + "models/randomForestModel")

    // Load it back into a new variable
    val model2 = RandomForestModel.load(sc, inputDir + "models/randomForestModel")

Not sure if there is PMML support. The model saves itself into a directory structure that looks like this:

    data/
        _SUCCESS
        _common_metadata
        _metadata
        part-r-*.gz.parquet (multiple files)
    metadata/
        _SUCCESS
        part-00000

HTH
-sujit

On Thu, Oct 22, 2015 at 5:33 AM, Sebastian Kuepers <sebastian.kuep...@publicispixelpark.de> wrote:

> Hey,
>
> I am trying to figure out the best practice for saving and loading models
> which have been fitted with the ML package - i.e. with the RandomForest
> classifier.
>
> There is PMML support in the MLlib package afaik but not in ML - is that
> correct?
>
> How do you approach this, so that you do not have to fit your model before
> every prediction job?
>
> Thanks,
> Sebastian
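
To avoid refitting before every prediction job, one option is to wrap the save/load calls in a small "load if present, otherwise train and save" helper. The sketch below is only an illustration: ModelCache and loadOrTrain are hypothetical names (not part of Spark), and it assumes the model path is on the local filesystem. For HDFS or S3 paths the existence check would have to go through the Hadoop FileSystem API instead.

```scala
import java.nio.file.{Files, Paths}

// Hypothetical helper, not part of Spark: reuse a saved model when one
// exists at `path`, otherwise train once and persist the result.
object ModelCache {
  def loadOrTrain[M](path: String)(load: String => M)(train: () => M)(save: (M, String) => Unit): M = {
    // model.save(sc, path) writes data/ and metadata/ under `path`, so the
    // presence of metadata/ is used here as the "model exists" marker.
    // Assumption: `path` is a local filesystem path.
    if (Files.isDirectory(Paths.get(path, "metadata"))) {
      load(path)
    } else {
      val model = train()
      save(model, path)
      model
    }
  }
}

// Possible use with the calls from the snippet in this mail (needs Spark and
// the same sc, data, inputDir and training parameters in scope):
//
//   val model = ModelCache.loadOrTrain(inputDir + "models/randomForestModel")(
//     p => RandomForestModel.load(sc, p))(
//     () => RandomForest.trainClassifier(data, numClasses, categoricalFeaturesInfo,
//       numTrees, featureSubsetStrategy, impurity, maxDepth, maxBins, seed))(
//     (m, p) => m.save(sc, p))
```

The load, train and save steps are passed in as functions so the helper itself has no Spark dependency. Checking only for the metadata/ directory will not detect a save that was interrupted halfway, so treat it as a starting point.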