Hello Sparkers,

I'm reading data from a CSV file, applying some transformations, and ending up with an RDD of pairs (String, Iterable<>).
I have already prepared the Parquet files. I now want to take that (key, value) RDD and populate the Parquet files as follows:

- key holds the name of the Parquet file;
- value holds the rows to save in the Parquet file whose name is the key.

I tried the simplest approach I could think of: creating a DataFrame inside a 'map' or 'foreach' on the pair RDD, but this throws a NullPointerException. From what I've read, this happens because nesting RDD/DataFrame operations inside another RDD operation is not allowed (a boiled-down version of what I tried is below).

Any help on how to achieve this another way?
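For reference, here is roughly the shape of what I tried, reduced to a small standalone example. The MyRecord bean, the sample data and the /data/parquet output path are just placeholders for my real job, and I've paraphrased the DataFrame creation with the List-based createDataFrame overload for brevity; the failing pattern (building and writing a DataFrame inside foreach on the pair RDD) is the same:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import scala.Tuple2;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class FillParquetByKey {

  // Simple bean standing in for my real record class.
  public static class MyRecord implements java.io.Serializable {
    private String value;
    public MyRecord() {}
    public MyRecord(String value) { this.value = value; }
    public String getValue() { return value; }
    public void setValue(String value) { this.value = value; }
  }

  public static void main(String[] args) {
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("fill-parquet-by-key"));
    SQLContext sqlContext = new SQLContext(sc);

    // Stand-in for the (String, Iterable<MyRecord>) pair RDD that my CSV
    // transformations produce.
    JavaPairRDD<String, Iterable<MyRecord>> grouped = sc
        .parallelizePairs(Arrays.asList(
            new Tuple2<>("file1", new MyRecord("a")),
            new Tuple2<>("file1", new MyRecord("b")),
            new Tuple2<>("file2", new MyRecord("c"))))
        .groupByKey();

    grouped.foreach(pair -> {
      // Copy the Iterable value into a List so a DataFrame can be built from it.
      List<MyRecord> rows = new ArrayList<>();
      for (MyRecord r : pair._2()) {
        rows.add(r);
      }
      // This block is where the NullPointerException shows up: the SQLContext
      // is being used inside a task running on an executor, where its
      // underlying SparkContext is not available.
      DataFrame df = sqlContext.createDataFrame(rows, MyRecord.class);
      df.write().parquet("/data/parquet/" + pair._1());
    });

    sc.stop();
  }
}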