I've been searching around and see that others have asked similar questions. Given a SchemaRDD, I extract a result set that contains numbers, both Ints and Doubles. How do I construct an RDD[Vector] from it? In 1.2 I wrote the results to a text file and then read them back in, splitting each line with some code I found in an ML book on Spark Analytics. That seems clunky. In the 1.3 snapshot that flow doesn't even work, as I couldn't find a call to write the data out to a file.
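To make the question concrete, here is a minimal sketch of the direct conversion I am after. It assumes that `DataFrame.rdd` in 1.3 exposes the rows as an `RDD[Row]` and that every selected column is numeric. The names `RowsToVectors`, `toDouble`, `rowToVector` and `toVectorRDD` are hypothetical, made up for illustration:

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row}

object RowsToVectors {
  // Widen one column value to Double. Int, Long, Float and Double all arrive
  // boxed as java.lang.Number; anything else (null, String, ...) is rejected.
  def toDouble(value: Any): Double = value match {
    case n: java.lang.Number => n.doubleValue()
    case other =>
      throw new IllegalArgumentException(s"Non-numeric column value: $other")
  }

  // Turn one Row into a dense MLlib vector, keeping the column order.
  def rowToVector(row: Row): Vector =
    Vectors.dense(row.toSeq.map(toDouble).toArray)

  // Assumption: df.rdd yields the underlying RDD[Row] (1.3 DataFrame API).
  def toVectorRDD(df: DataFrame): RDD[Vector] =
    df.rdd.map(rowToVector)
}
```

If that assumption holds, something like `Statistics.colStats(RowsToVectors.toVectorRDD(df))` from `org.apache.spark.mllib.stat` should give the basic metrics without a round trip through a file. In 1.2, where a SchemaRDD is itself an RDD[Row], `schemaRDD.map(RowsToVectors.rowToVector)` would presumably do the same job.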
So I want to do two things with this data. The first is to use the basic stats to extract some basic metrics from the data. The other, as mentioned above, is to use the ML library. The ML Pipeline seems to use the new DataFrame API, but the basic stats still require an RDD[Vector].

Thanks,
Mark