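Mapper2 doesn't wait for Mapper1. They start at the same time. Mapper2 finds its first "real" record just by looking at the bytes it reads: it scans forward from the start of its split until it sees a newline, and the byte after that newline is the start of its first "real" record. It discards everything up to and including that newline, so it never needs to know how far Mapper1 has got. Mapper1, for its part, reads past the end of its own split to finish the record that crosses the boundary. Check the source code of LineRecordReader for the details.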
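To make that concrete, here is a rough sketch in Python. It is not the LineRecordReader code, only an illustration of the idea; `read_split` is a made-up name, and the exact boundary rule (a reader keeps any line that starts at or before the end of its split) is an assumption that may differ between Hadoop versions.

```python
from typing import List


def read_split(data: bytes, start: int, end: int) -> List[bytes]:
    """Return the records owned by the split [start, end) of `data`.

    Hypothetical helper for illustration; not part of Hadoop.
    """
    pos = start
    if start != 0:
        # Not the first split: throw away everything up to and including
        # the first newline, because the previous reader owns that line.
        newline = data.find(b"\n", start)
        if newline == -1:
            return []
        pos = newline + 1

    records = []
    # Keep reading while the line *starts* at or before `end`, so the
    # line that crosses the boundary is finished by this reader.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        if newline == -1:
            records.append(data[pos:])
            break
        records.append(data[pos:newline])
        pos = newline + 1
    return records


data = b"alpha\nbravo\ncharlie\ndelta\n"
# The split boundary at byte 10 falls in the middle of "bravo".
print(read_split(data, 0, 10))          # [b'alpha', b'bravo']
print(read_split(data, 10, len(data)))  # [b'charlie', b'delta']
```

Each call only looks at its own bytes plus the tail of one straddling line, so the two readers can run at the same time and still cover every record exactly once.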
________________________________
From: Zhong Wang <[email protected]>
To: [email protected]
Sent: Thursday, June 11, 2009 10:47:48 AM
Subject: Re: Large size Text file split

> Mapper 2 starts reading at byte 10000. It finds the first newline at byte
> 10020, so the first "real" record it processes starts at byte 10021.
>

There's one problem: how does Mapper2 know the "real" record start at 10021 before Mapper1 reach the end of Split1 (9999)? Mappers starts at the same time.

--
Zhong Wang
