Re: Saving each line of RDD as a separate file with key as the file name

2018-01-20 Thread Jörn Franke
Not sure if I understood exactly what you need, but you could have one partition per line. Alternatively, you could use the MultipleOutputs format in Hadoop.

> On 20. Jan 2018, at 22:56, pooja bhojwani wrote:
>
> Hi all,
>
> So, I have a Java Pair RDD with let’s say n lines, each line has a uniq
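For illustration only, here is a rough sketch of a third option that is not proposed in this thread: skip splitting the RDD and write one file per record from inside the executors, for example via `foreachPartition` on the pair RDD. The class `PerKeyFileWriter`, the method `writeRecords`, and the `.txt` suffix are hypothetical names chosen for this example. `Map.Entry` stands in for Spark's `Tuple2` so that the snippet runs without Spark on the classpath, and it assumes every key is already a valid file name and that the output directory is reachable from wherever the code runs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.Map;

// Hypothetical helper: writes one text file per (key, value) record.
// In a Spark job this could be called from something like
//   pairRdd.foreachPartition(it -> ...)
// after adapting each Tuple2 to a Map.Entry (assumption, not shown here).
final class PerKeyFileWriter {

    private PerKeyFileWriter() {
    }

    // Writes each record to <outputDir>/<key>.txt, using the map's
    // toString() form as the file content.
    // Assumption: keys are unique and safe to use as file names.
    static void writeRecords(Iterator<Map.Entry<String, Map<String, String>>> records,
                             Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        while (records.hasNext()) {
            Map.Entry<String, Map<String, String>> record = records.next();
            Path target = outputDir.resolve(record.getKey() + ".txt");
            byte[] content = record.getValue().toString().getBytes(StandardCharsets.UTF_8);
            Files.write(target, content);
        }
    }
}
```

Note that `java.nio.file` writes to the local file system of whichever machine runs the code. On a cluster, that would be each executor's local disk unless the directory is on shared storage, so a real job would more likely go through the Hadoop `FileSystem` API or one of the Hadoop output formats mentioned above.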

Saving each line of RDD as a separate file with key as the file name

2018-01-20 Thread pooja bhojwani
Hi all, So, I have a Java Pair RDD with, let’s say, n lines; each line has a unique key and a hash map as the value (there are no duplicate keys). I want to save each line as a separate text file, and since saveAsTextFile is not serializable, I need to somehow split the RDD into n RDDs or so and save