What version of Spark are you on?
Although it's cut off, I think your error is with RandomForestClassifier,
is that correct? If so, you should upgrade to Spark 2, since I think this
class only became writable/readable in Spark 2
(https://github.com/apache/spark/pull/12118).
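
For reference, here is a minimal sketch of what should work on Spark 2.x
(Scala; the DataFrame train, the column names, and the output paths are
just placeholders):

    import org.apache.spark.ml.classification.{RandomForestClassificationModel, RandomForestClassifier}

    // Fit a random forest on a DataFrame with "features" and "label" columns.
    val rfc = new RandomForestClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
    val rfcModel = rfc.fit(train)

    // In Spark 2.x the fitted model implements MLWritable, so it can be saved...
    rfcModel.write.overwrite().save("output/rfcModel")

    // ...and loaded back later.
    val restored = RandomForestClassificationModel.load("output/rfcModel")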

On Thu, Jan 12, 2017 at 8:43 AM, Md. Rezaul Karim <
rezaul.ka...@insight-centre.org> wrote:

> Hi Malshan,
>
> The error says that one (or more) of the estimators/stages in your
> pipeline is not writable, i.e. it does not support the model
> write/overwrite operation.
>
> Suppose you want to configure an ML pipeline consisting of three stages
> (transformers/estimators): tokenizer, hashingTF, and nb:
>     val nb = new NaiveBayes().setSmoothing(0.00001)
>     // assuming the raw text column is called "text"
>     val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
>     val hashingTF = new HashingTF()
>       .setInputCol(tokenizer.getOutputCol)
>       .setOutputCol("features")
>
>     val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, nb))
>
>
> Now check whether all the stages are writable. To make this easier, try
> saving the stages individually, e.g.:
>
>     tokenizer.write.save("path")
>     hashingTF.write.save("path")
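>
> If you are not sure which stage is the problem, a quick check (just a
> sketch; it assumes the pipeline variable from the snippet above) is to
> test each stage against the MLWritable trait:
>
>     import org.apache.spark.ml.util.MLWritable
>
>     // Print which stages can be saved and which cannot.
>     pipeline.getStages.foreach { stage =>
>       println(s"${stage.uid} writable: ${stage.isInstanceOf[MLWritable]}")
>     }
>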
> After that, suppose you want to perform a 10-fold cross-validation as
> follows:
>     val cv = new CrossValidator()
>               .setEstimator(pipeline)
>               .setEvaluator(new BinaryClassificationEvaluator)
>               .setEstimatorParamMaps(paramGrid)
>               .setNumFolds(10)
>
> Where:
>     val paramGrid = new ParamGridBuilder()
>                             .addGrid(hashingTF.numFeatures, Array(10, 100, 1000))
>                             .addGrid(nb.smoothing, Array(0.001, 0.0001))
>                             .build()
>
> Now the model that you trained using the training set should be writable
> if all of the stages are okay:
>     val model = cv.fit(trainingData)
>     model.write.overwrite().save("output/NBModel")
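>
> As a sanity check (assuming the save above succeeds), the persisted model
> can be read back later with the corresponding loader, e.g.:
>
>     import org.apache.spark.ml.tuning.CrossValidatorModel
>
>     // Reload the cross-validated model from the same path.
>     val loadedModel = CrossValidatorModel.load("output/NBModel")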
>
>
>
> Hope that helps.
>
> Regards,
> _________________________________
> *Md. Rezaul Karim*, BSc, MSc
> PhD Researcher, INSIGHT Centre for Data Analytics
> National University of Ireland, Galway
> IDA Business Park, Dangan, Galway, Ireland
> Web: http://www.reza-analytics.eu/index.html
>
> On 12 January 2017 at 09:09, Minudika Malshan <minudika...@gmail.com>
> wrote:
>
>> Hi,
>>
>> When I try to save a pipeline model using spark ML (Java) , the following
>> exception is thrown.
>>
>>
>> java.lang.UnsupportedOperationException: Pipeline write will fail on
>> this Pipeline because it contains a stage which does not implement
>> Writable. Non-Writable stage: rfc_98f8c9e0bd04 of type class
>> org.apache.spark.ml.classification.Rand
>>
>>
>> Here is my code segment.
>>
>> model.write().overwrite().save("mypath");
>>
>>
>> How can I resolve this?
>>
>> Thanks and regards!
>>
>> Minudika
>>
>>
>


-- 
Asher Krim
Senior Software Engineer
