Hello Todd,

My aim is to let the reduce go ahead with the reduction as soon as it has the data it needs, instead of waiting for all the maps to complete. If it knows how many records it needs and compares that with the number of records it has received so far, it can move on once the two are equal, without waiting for all the maps to finish.
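To make the idea concrete, here is a minimal sketch of the comparison I have in mind. RecordProgressTracker and its methods are hypothetical names of my own, not existing Hadoop classes, and the sketch assumes that the expected total and the per-output record counts are available from somewhere.

```java
// Hypothetical helper, not part of Hadoop: compares the number of records
// received so far with the number of records the reduce expects.
class RecordProgressTracker {
    // Assumption: the reduce knows up front how many records it needs.
    private final long expectedRecords;
    private long receivedRecords = 0;

    RecordProgressTracker(long expectedRecords) {
        if (expectedRecords < 0) {
            throw new IllegalArgumentException("expectedRecords must be >= 0");
        }
        this.expectedRecords = expectedRecords;
    }

    // To be called once per copied map output, with the number of records
    // that output contained (assumed to be known by the caller).
    synchronized void addReceived(long recordCount) {
        if (recordCount < 0) {
            throw new IllegalArgumentException("recordCount must be >= 0");
        }
        receivedRecords += recordCount;
    }

    // True once at least the expected number of records has arrived.
    synchronized boolean hasAllRecords() {
        return receivedRecords >= expectedRecords;
    }

    // Number of records still missing, never negative.
    synchronized long remaining() {
        return Math.max(0L, expectedRecords - receivedRecords);
    }
}
```

The copier would call addReceived(...) after each successful copy, and the reduce would start as soon as hasAllRecords() returns true. The methods are synchronized on the assumption that several copier threads update the count concurrently.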
So if I can find out the number of records in each file the MapOutputCopier has copied, I can do this comparison. But at the moment the lengths received by the copier, which are held in these variables:

    long decompressedLength = Long.parseLong(connection.getHeaderField(RAW_MAP_OUTPUT_LENGTH));
    long compressedLength = Long.parseLong(connection.getHeaderField(MAP_OUTPUT_LENGTH));

are not even multiples of the record size. I want to know how to get the number of records from these lengths.

I am doing all this as an experiment for my graduate research, and if everything works I can come up with a contrib file for it.

Thanks,
Naresh Rapolu.