Hi,
I am trying to do an outer join on two input files.
During the join, however, the TupleWritable value passed to the mapper is not
reset between keys, so values left over from a previous key show up in the output.
The code I used is below ('plist' holds the set of paths to be taken as
input):
jobConf.setInputFormat(CompositeInputFormat.class);
jobConf.set("mapred.join.expr",
    CompositeInputFormat.compose(op, inputFormatClass, plist.toArray(new Path[0])));
jobConf.setOutputFormat(outputFormatClass);
inp1:
anil1 10
anil2 20
anil3 30
dev1 40
dev2 50
inp2:
anil1 100
dev1 400
dev2 500
dev3 600
Actual outer join output:
anil1 10,100
anil2 20,100
anil3 30,100
dev1 40,400
dev2 50,500
dev3 50,600
Shouldn't it actually be the following?
anil1 10,100
anil2 20
anil3 30
dev1 40,400
dev2 50,500
dev3 600
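
To make the expectation precise, here is a minimal sketch in plain Java of the outer join semantics I have in mind. It does not use any Hadoop classes; the class OuterJoinSketch and its methods are hypothetical names made up for this illustration, and the two maps simply hold the inp1 and inp2 data from above. A key that is missing from one input should contribute no value from that input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Hadoop: models the outer join I expect.
class OuterJoinSketch {

    // Full outer join of two key -> value maps. Emits "key v1,v2" when the key
    // is in both inputs, and "key v" when it is in only one of them.
    static List<String> outerJoin(Map<String, String> left, Map<String, String> right) {
        TreeSet<String> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            // Start from an empty list for every key, so nothing carries over.
            List<String> present = new ArrayList<>();
            if (left.containsKey(key)) {
                present.add(left.get(key));
            }
            if (right.containsKey(key)) {
                present.add(right.get(key));
            }
            out.add(key + " " + String.join(",", present));
        }
        return out;
    }

    // Runs the join on the inp1 / inp2 sample data from this post.
    static List<String> sample() {
        Map<String, String> inp1 = new TreeMap<>();
        inp1.put("anil1", "10");
        inp1.put("anil2", "20");
        inp1.put("anil3", "30");
        inp1.put("dev1", "40");
        inp1.put("dev2", "50");

        Map<String, String> inp2 = new TreeMap<>();
        inp2.put("anil1", "100");
        inp2.put("dev1", "400");
        inp2.put("dev2", "500");
        inp2.put("dev3", "600");

        return outerJoin(inp1, inp2);
    }

    public static void main(String[] args) {
        sample().forEach(System.out::println);
    }
}
```

The important part is that the list of values is rebuilt from scratch for each key, which is the behaviour I expected from the TupleWritable handed to the mapper.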
Regards,
Devansh Rusia