Hello Sparkers,

I'm reading data from a CSV file, applying some transformations, and ending
up with a pair RDD of (String, Iterable<>).

I have already prepared Parquet files. I now want to take this (key, value)
RDD and populate the Parquet files as follows (see the sketch after this list):
- The key holds the name of the target Parquet file.
- The value holds the values to save into the Parquet file whose name is the key.
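To make the setup concrete, here is roughly what my pipeline looks like. This
is only a simplified sketch: the CSV layout, the column positions, and the
assumption that the values are Rows are placeholders, and the real
transformations are more involved.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.RowFactory;
    import scala.Tuple2;

    JavaSparkContext sc =
        new JavaSparkContext(new SparkConf().setAppName("csv-to-parquet"));

    // key = name of the target Parquet file, value = rows that belong in it
    JavaPairRDD<String, Iterable<Row>> grouped = sc
        .textFile("data.csv")
        .map(line -> line.split(","))
        .mapToPair(fields -> new Tuple2<>(fields[0],
            RowFactory.create(fields[1], fields[2])))
        .groupByKey();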

I tried the simplest approach one can think of: creating a DataFrame inside
a 'map' or 'foreach' on the pair RDD, but this threw a NullPointerException.
From what I have read, this is because nesting RDD/DataFrame operations inside
other RDD operations is not allowed.
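For reference, the failing attempt looked roughly like this (again a
simplified sketch; the schema, the column names, and the output paths are
placeholders):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SQLContext;
    import org.apache.spark.sql.types.DataTypes;
    import org.apache.spark.sql.types.Metadata;
    import org.apache.spark.sql.types.StructField;
    import org.apache.spark.sql.types.StructType;

    SQLContext sqlContext = new SQLContext(sc);
    StructType schema = new StructType(new StructField[] {
        new StructField("col1", DataTypes.StringType, true, Metadata.empty()),
        new StructField("col2", DataTypes.StringType, true, Metadata.empty())
    });

    grouped.foreach(pair -> {
        List<Row> rows = new ArrayList<>();
        for (Row r : pair._2()) {
            rows.add(r);
        }
        // sqlContext and sc are driver-side objects and are not usable inside
        // executor code; this is where I get the NullPointerException.
        DataFrame df = sqlContext.createDataFrame(sc.parallelize(rows), schema);
        df.write().parquet("/parquet/" + pair._1());
    });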

Any help on how to achieve this in another way?



