Hi Justin & Ram,

To clarify, PipelineModel.stages is not private[ml]; only the PipelineModel constructor is private[ml]. So it's safe to use pipelineModel.stages as a Spark user.
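As a rough sketch of what that looks like in user code: this assumes cvModel is a fitted CrossValidatorModel whose best model is a Tokenizer / HashingTF / LogisticRegression pipeline like the one in CrossValidatorExample. printBestParams is only a hypothetical helper name for illustration, not a Spark API. It matches on the stage types instead of casting by index, so it keeps working if the stage order changes.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.classification.LogisticRegressionModel
import org.apache.spark.ml.feature.HashingTF
import org.apache.spark.ml.tuning.CrossValidatorModel

// Hypothetical helper (not part of Spark): prints the tuned parameters of the
// best pipeline selected by a CrossValidator.
def printBestParams(cvModel: CrossValidatorModel): Unit = {
  cvModel.bestModel match {
    // The public `stages` field holds the fitted stages of the pipeline.
    case pipelineModel: PipelineModel =>
      pipelineModel.stages.foreach {
        case tf: HashingTF =>
          println(s"numFeatures = ${tf.getNumFeatures}")
        case lr: LogisticRegressionModel =>
          println(s"regParam = ${lr.getRegParam}")
        case _ =>
          // Other stages (e.g. the Tokenizer) have no tuned parameter of interest here.
      }
    // Assumption: the estimator passed to CrossValidator was a Pipeline.
    case other =>
      println(s"Best model is not a PipelineModel: ${other.getClass.getName}")
  }
}
```

Calling printBestParams(cvModel) after cv.fit(...) should print one line per matched stage. The only public surface it relies on is bestModel, stages, and the stage getters, so no wrapper inside the ml package is needed.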
Ram's example looks good. Btw, in Spark 1.4 (and the current master build), we've made a number of improvements to Params and Pipelines, so this should become easier to use!

Joseph

On Sun, May 17, 2015 at 10:17 PM, Justin Yip <yipjus...@prediction.io> wrote:

> Thanks Ram.
>
> Your sample is very helpful. (There is a minor bug: PipelineModel.stages
> is hidden under private[ml], so it just needs a wrapper around it. :)
>
> Justin
>
> On Sat, May 16, 2015 at 10:44 AM, Ram Sriharsha <sriharsha....@gmail.com>
> wrote:
>
>> Hi Justin
>>
>> The CrossValidatorExample here
>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/CrossValidatorExample.scala
>> is a good example of how to set up an ML Pipeline for extracting a model
>> with the best parameter set.
>>
>> You set up the pipeline as in here:
>>
>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/CrossValidatorExample.scala#L73
>>
>> This pipeline is treated as an estimator and wrapped into a
>> CrossValidator to do grid search and return the model with the best
>> parameters. Once you have trained the best model as in here
>>
>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/CrossValidatorExample.scala#L93
>>
>> the result is a CrossValidatorModel which contains the best estimator
>> (i.e. the best pipeline above), and you can extract the best pipeline and
>> query its parameters as follows:
>>
>> // what are the best parameters?
>> val bestPipelineModel = cvModel.bestModel.asInstanceOf[PipelineModel]
>> val stages = bestPipelineModel.stages
>>
>> val hashingStage = stages(1).asInstanceOf[HashingTF]
>> println(hashingStage.getNumFeatures)
>> val lrStage = stages(2).asInstanceOf[LogisticRegressionModel]
>> println(lrStage.getRegParam)
>>
>> Ram
>>
>> On Sat, May 16, 2015 at 3:17 AM, Justin Yip <yipjus...@prediction.io>
>> wrote:
>>
>>> Hello,
>>>
>>> I am using ML Pipeline. I would like to extract the best parameters
>>> found by CrossValidator, but I cannot find much documentation on how
>>> to do it. Can anyone give me some pointers?
>>>
>>> Thanks.
>>>
>>> Justin
>>>
>>> ------------------------------
>>> View this message in context: Getting the best parameter set back from
>>> CrossValidatorModel
>>> <http://apache-spark-user-list.1001560.n3.nabble.com/Getting-the-best-parameter-set-back-from-CrossValidatorModel-tp22915.html>
>>> Sent from the Apache Spark User List mailing list archive
>>> <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.