Want 1-1 map between input files and output files in map-only job

Arun Luthra Thu, 19 Nov 2015 08:29:23 -0800

Hello,

Is there a technique for guaranteeing a 1-1 correspondence between the
input files and the output files? For example, if my input directory has
files called input001.txt, input002.txt, etc., I would like Spark to
generate output files named something like part-00001, part-00002, etc.,
so that part-00001 holds the data mapped from input001.txt, part-00002
holds the data mapped from input002.txt, and so on.


My program looks something like this:

val myrdd = sc.textFile("/data/input_directory")
myrdd.map(myfunction).saveAsTextFile("/data/output_directory")


Thanks,
Arun
