Sir,
Actually, I want to perform de-duplication on input splits. For this, I
have to perform content-based chunking (using the TTTD algorithm) on each
input split, skip the chunks that match a previously seen chunk, and send
only the new chunks to the map.
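
For reference, the idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Hadoop code: the class name TttdDedupSketch, the threshold and divisor values, the simple additive window hash (TTTD is usually paired with a Rabin fingerprint), and the use of SHA-256 digests to detect identical chunks are all hypothetical choices. It does not show how to hook the step into an input split or into MapTask.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Hadoop): TTTD-style chunking plus digest-based de-duplication. */
class TttdDedupSketch {
    // Illustrative parameters only (assumptions); real values need tuning.
    static final int T_MIN = 64;     // minimum chunk size in bytes
    static final int T_MAX = 512;    // maximum chunk size in bytes
    static final int D_MAIN = 128;   // main divisor
    static final int D_BACKUP = 64;  // backup divisor, satisfied more often
    static final int WINDOW = 16;    // bytes covered by the window hash

    /** Sum of the last WINDOW bytes ending at index end (a stand-in for a Rabin fingerprint). */
    static int windowHash(byte[] data, int end) {
        int sum = 0;
        for (int j = Math.max(0, end - WINDOW + 1); j <= end; j++) {
            sum += data[j] & 0xff;
        }
        return sum;
    }

    /** Splits data into chunks using two thresholds (T_MIN, T_MAX) and two divisors. */
    static List<byte[]> chunk(byte[] data) {
        List<byte[]> chunks = new ArrayList<>();
        int start = 0;    // first byte of the chunk being built
        int backup = -1;  // last backup breakpoint seen in this chunk, or -1
        for (int i = 0; i < data.length; i++) {
            int len = i - start + 1;
            if (len < T_MIN) {
                continue; // never cut before the minimum size
            }
            int hash = windowHash(data, i);
            if (hash % D_BACKUP == D_BACKUP - 1) {
                backup = i; // remember a fallback cut point
            }
            boolean mainHit = hash % D_MAIN == D_MAIN - 1;
            if (mainHit || len >= T_MAX) {
                // At T_MAX without a main hit, prefer the backup breakpoint if one exists.
                int cut = (!mainHit && backup >= 0) ? backup : i;
                chunks.add(Arrays.copyOfRange(data, start, cut + 1));
                start = cut + 1;
                backup = -1;
                i = cut; // resume scanning right after the cut
            }
        }
        if (start < data.length) {
            chunks.add(Arrays.copyOfRange(data, start, data.length)); // trailing chunk
        }
        return chunks;
    }

    /** Returns only chunks whose SHA-256 digest is not yet in seen; adds new digests to seen. */
    static List<byte[]> newChunksOnly(List<byte[]> chunks, Set<String> seen) {
        MessageDigest sha;
        try {
            sha = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        List<byte[]> fresh = new ArrayList<>();
        for (byte[] c : chunks) {
            String key = Base64.getEncoder().encodeToString(sha.digest(c));
            if (seen.add(key)) {
                fresh.add(c);
            }
        }
        return fresh;
    }
}
```

In this sketch the caller would keep one Set of digests across splits, call chunk on the bytes of each split, and pass only the result of newChunksOnly on to the map. Note that it detects identical chunks, not merely similar ones.
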
Sir, please tell me in which
Thanks, Chris :)
On Tue, Apr 7, 2015 at 10:43 PM, Chris Nauroth wrote:
> Hello Shahil,
>
> In the current trunk codebase, the relevant files are
> hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java and
> hadoop-mapr
Hello Shahil,
In the current trunk codebase, the relevant files are
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java and
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/or