Hello,

Is there a technique for guaranteeing a 1-1 correspondence between the
input files and the output files? For example, if my input directory has
files called input001.txt, input002.txt, etc., I would like Spark to
generate output files named something like part-00001, part-00002, etc.,
so that the data from input001.txt is mapped into part-00001, the data
from input002.txt is mapped into part-00002, and so on.

My program looks something like this:

val myrdd = sc.textFile("/data/input_directory")
myrdd.map(myfunction).saveAsTextFile("/data/output_directory")
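
To make the correspondence I am after concrete, here is a small plain-Scala sketch with no Spark involved. The object DesiredMapping and the helper expectedPartName are names made up for this illustration and are not part of any Spark API. The sketch only spells out which output name I would expect for each input name, assuming the inputNNN.txt pattern above.

```scala
// Hypothetical illustration only: this spells out the desired naming,
// it does not make Spark produce it.
object DesiredMapping {
  // Matches names such as input001.txt and captures the digits.
  private val InputName = """input(\d+)\.txt""".r

  // Returns the output name I would like for a given input file name,
  // or None if the name does not follow the inputNNN.txt pattern.
  def expectedPartName(inputFile: String): Option[String] = inputFile match {
    case InputName(digits) => Some(f"part-${digits.toInt}%05d")
    case _                 => None
  }

  def main(args: Array[String]): Unit = {
    Seq("input001.txt", "input002.txt").foreach { name =>
      val target = expectedPartName(name).getOrElse("(no match)")
      println(name + " -> " + target)
    }
  }
}
```

So input001.txt should end up in part-00001, input002.txt in part-00002, and so on, with nothing from one input file leaking into another file's output.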


Thanks,
Arun
