Hi Justin & Ram,

To clarify, PipelineModel.stages is not private[ml]; only the PipelineModel
constructor is private[ml].  So it's safe to use pipelineModel.stages as a
Spark user.
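
For illustration, here is a minimal sketch of reading the stages from user
code. It assumes cvModel is a fitted CrossValidatorModel whose estimator was a
Pipeline; describeBestPipeline is a hypothetical helper name, not part of
Spark, and the exact params reported on fitted stages may vary by Spark
version.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.tuning.CrossValidatorModel

// Hypothetical helper (not part of Spark): prints every stage of the best
// pipeline found by cross-validation, along with that stage's params.
def describeBestPipeline(cvModel: CrossValidatorModel): Unit = {
  cvModel.bestModel match {
    case pipelineModel: PipelineModel =>
      // stages is a public member, so no wrapper is needed here.
      pipelineModel.stages.zipWithIndex.foreach { case (stage, i) =>
        println(s"Stage $i: ${stage.getClass.getSimpleName}")
        // explainParams() lists each param with its doc and current value.
        println(stage.explainParams())
      }
    case other =>
      // The estimator passed to CrossValidator was not a Pipeline.
      println(s"Best model is not a PipelineModel: ${other.getClass.getName}")
  }
}
```

Pattern matching on bestModel avoids a ClassCastException if the estimator
turns out not to be a Pipeline, which a bare asInstanceOf would throw.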

Ram's example looks good.  Btw, in Spark 1.4 (and the current master
build), we've made a number of improvements to Params and Pipelines, so
this should become easier to use!

Joseph

On Sun, May 17, 2015 at 10:17 PM, Justin Yip <yipjus...@prediction.io>
wrote:

>
> Thanks Ram.
>
> Your sample is very helpful. There is one minor issue: PipelineModel.stages
> is hidden under private[ml], so it just needs a wrapper around it. :)
>
> Justin
>
> On Sat, May 16, 2015 at 10:44 AM, Ram Sriharsha <sriharsha....@gmail.com>
> wrote:
>
>> Hi Justin
>>
>> The CrossValidatorExample here
>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/CrossValidatorExample.scala
>> is a good example of how to set up an ML Pipeline for extracting a model
>> with the best parameter set.
>>
>> You set up the pipeline as in here:
>>
>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/CrossValidatorExample.scala#L73
>>
>> This pipeline is treated as an estimator and wrapped in a CrossValidator,
>> which does the grid search and returns the model with the best parameters.
>> Once you have trained the best model, as in here:
>>
>> https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/CrossValidatorExample.scala#L93
>>
>> the result is a CrossValidatorModel, which contains the best model (i.e.
>> the fitted version of the pipeline above). You can extract the best
>> pipeline and query its parameters as follows:
>>
>> // what are the best parameters?
>> val bestPipelineModel = cvModel.bestModel.asInstanceOf[PipelineModel]
>> val stages = bestPipelineModel.stages
>>
>> val hashingStage = stages(1).asInstanceOf[HashingTF]
>> println(hashingStage.getNumFeatures)
>> val lrStage = stages(2).asInstanceOf[LogisticRegressionModel]
>> println(lrStage.getRegParam)
>>
>>
>>
>> Ram
>>
>> On Sat, May 16, 2015 at 3:17 AM, Justin Yip <yipjus...@prediction.io>
>> wrote:
>>
>>> Hello,
>>>
>>> I am using the ML Pipeline API. I would like to extract the best
>>> parameters found by CrossValidator, but I cannot find much documentation
>>> on how to do it. Can anyone give me some pointers?
>>>
>>> Thanks.
>>>
>>> Justin
>>>
>>> ------------------------------
>>> View this message in context: Getting the best parameter set back from
>>> CrossValidatorModel
>>> <http://apache-spark-user-list.1001560.n3.nabble.com/Getting-the-best-parameter-set-back-from-CrossValidatorModel-tp22915.html>
>>> Sent from the Apache Spark User List mailing list archive
>>> <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.
>>>
>>
>>
>
