Hi Raj,

Could you share the code that triggers this, so that others can help diagnose the issue? Which Spark version are you using? I cannot reproduce the problem in my environment.

Thanks
Yanbo

2016-02-26 10:49 GMT+08:00 raj.kumar <raj.ku...@hooklogic.com>:

> Hi,
>
> I am using mllib. I use the ml vectorization tools to create the vectorized
> input dataframe for the ml/mllib machine-learning models with schema:
>
> root
>  |-- label: double (nullable = true)
>  |-- features: vector (nullable = true)
>
> To avoid repeated vectorization, I am trying to save and load this dataframe
> using
>
>   df.write.format("json").mode("overwrite").save( url )
>   val data = Spark.sqlc.read.format("json").load( url )
>
> However when I load the dataframe, the newly loaded dataframe has the
> following schema:
>
> root
>  |-- features: struct (nullable = true)
>  |    |-- indices: array (nullable = true)
>  |    |    |-- element: long (containsNull = true)
>  |    |-- size: long (nullable = true)
>  |    |-- type: long (nullable = true)
>  |    |-- values: array (nullable = true)
>  |    |    |-- element: double (containsNull = true)
>  |-- label: double (nullable = true)
>
> which the machine-learning models do not recognize.
>
> Is there a way I can save and load this dataframe without the schema
> changing?
> I assume it has to do with the fact that Vector is not a basic type.
>
> thanks
> -Raj
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Saving-and-Loading-Dataframes-tp26339.html
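
For readers who hit the schema change described in the quoted message, one commonly used workaround is to write the dataframe as Parquet instead of JSON, since Parquet files carry the Spark SQL schema along with the data. The following is only a minimal sketch, assuming a Spark 1.6-style SQLContext running locally; the object name, the sample rows, and the output path are hypothetical and not taken from the original message.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.sql.SQLContext

// Hypothetical driver object, for illustration only.
object SaveLoadVectors {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("save-load-vectors").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Made-up rows with the same (label, features) layout as in the question.
    val df = sc.parallelize(Seq(
      (1.0, Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))),
      (0.0, Vectors.dense(0.5, 0.0, 2.0))
    )).toDF("label", "features")

    // Hypothetical output location; replace with your own path or URL.
    val url = "/tmp/vectorized-df.parquet"

    // Same write/read calls as in the question, with "parquet" instead of "json".
    df.write.format("parquet").mode("overwrite").save(url)
    val data = sqlContext.read.format("parquet").load(url)

    data.printSchema()

    // The features column should come back with the vector type it was saved with.
    assert(data.schema("features").dataType == df.schema("features").dataType)

    sc.stop()
  }
}
```

The only change relative to the calls in the question is the format string. JSON files store the vector as a plain nested record, so on load the reader infers a struct from the data, whereas the Parquet reader restores the schema that was stored at write time.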