You can save the cluster centers as a SchemaRDD of two columns (id: Int, center: Array[Double]). When you load it back, you can construct the k-means model from its cluster centers.

-Xiangrui

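The approach described above could be sketched roughly as follows. This is only an illustration, assuming a Spark 1.2-era API (SchemaRDD, the `import sqlContext.createSchemaRDD` implicit, `saveAsParquetFile` / `parquetFile`) and a public `KMeansModel` constructor. The names `ClusterCenter`, `KMeansModelStore`, `save`, `load` and the `path` argument are hypothetical and not part of Spark.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.KMeansModel
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.sql.SQLContext

// Hypothetical row type: one row per cluster center, using only
// field types that Spark SQL's reflection-based schema inference knows.
case class ClusterCenter(id: Int, center: Array[Double])

object KMeansModelStore {
  // Flatten the model into (id, center) records, one per cluster.
  def toRecords(model: KMeansModel): Seq[ClusterCenter] =
    model.clusterCenters.zipWithIndex
      .map { case (v, i) => ClusterCenter(i, v.toArray) }
      .toSeq

  // Rebuild the model; sort by id so cluster indices match the original.
  def fromRecords(records: Seq[(Int, Seq[Double])]): KMeansModel =
    new KMeansModel(
      records.sortBy(_._1).map { case (_, c) => Vectors.dense(c.toArray) }.toArray)

  // Save the centers as a two-column SchemaRDD in Parquet format.
  def save(sc: SparkContext, sqlContext: SQLContext,
           model: KMeansModel, path: String): Unit = {
    import sqlContext.createSchemaRDD // implicit RDD[case class] => SchemaRDD
    sc.parallelize(toRecords(model)).saveAsParquetFile(path)
  }

  // Load the centers back and reconstruct the model from them.
  def load(sqlContext: SQLContext, path: String): KMeansModel = {
    val rows = sqlContext.parquetFile(path)
      // Assumption: the array column is read back as a Seq[Double].
      .map(r => (r.getInt(0), r(1).asInstanceOf[Seq[Double]]))
      .collect()
    fromRecords(rows)
  }
}
```

Storing an explicit id per center matters because row order is not guaranteed after a round trip through storage, and the index of each center is what `predict` returns as the cluster label.
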
On Tue, Jan 20, 2015 at 11:55 AM, Cheng Lian <[email protected]> wrote:
> This is because KMeansModel is neither a built-in type nor a user-defined
> type recognized by Spark SQL. I think you can write your own UDT version of
> KMeansModel in this case. You may refer to o.a.s.mllib.linalg.Vector and
> o.a.s.mllib.linalg.VectorUDT as an example.
>
> Cheng
>
> On 1/20/15 5:34 AM, Divyansh Jain wrote:
>
> Hey people,
>
> I have run into some issues regarding saving the k-means mllib model in
> Spark SQL by converting to a schema RDD. This is what I am doing:
>
> case class Model(id: String, model: org.apache.spark.mllib.clustering.KMeansModel)
> import sqlContext.createSchemaRDD
> val rowRdd = sc.makeRDD(Seq("id", model)).map(p => Model("id", model))
>
> This is the error that I get:
>
> scala.MatchError: org.apache.spark.mllib.classification.ClassificationModel
> (of class scala.reflect.internal.Types$TypeRef$anon$6)
> at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:53)
> at org.apache.spark.sql.catalyst.ScalaReflection$anonfun$schemaFor$1.apply(ScalaReflection.scala:64)
> at org.apache.spark.sql.catalyst.ScalaReflection$anonfun$schemaFor$1.apply(ScalaReflection.scala:62)
> at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:244)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> at scala.collection.AbstractTraversable.map(Traversable.scala:105)
> at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:62)
> at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:50)
> at org.apache.spark.sql.catalyst.ScalaReflection$.attributesFor(ScalaReflection.scala:44)
> at org.apache.spark.sql.execution.ExistingRdd$.fromProductRdd(basicOperators.scala:229)
> at org.apache.spark.sql.SQLContext.createSchemaRDD(SQLContext.scala:94)
>
> Any help would be appreciated. Thanks!
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Saving-a-mllib-model-in-Spark-SQL-tp21264.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
