TextInputFormat gives the byte offset in the file as the key and the entire line as the value, so treating the key as a line number won't work for you.
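To make the point about keys concrete, here is a small sketch in plain Java (no Hadoop classes involved) that computes the byte offset at which each line starts. This is the kind of value TextInputFormat passes as the key. The class and method names are made up for illustration, and the sketch assumes UTF-8 text with '\n' line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class OffsetKeys {
    // Hypothetical helper: returns the byte offset where each line starts,
    // mimicking the keys a mapper sees with TextInputFormat.
    // Assumes '\n' line endings and UTF-8 encoding.
    static List<Long> lineStartOffsets(String text) {
        List<Long> offsets = new ArrayList<>();
        long pos = 0;
        for (String line : text.split("\n")) {
            offsets.add(pos);
            // Advance past the line's bytes plus one byte for the '\n'.
            pos += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Three lines: "alpha" (5 bytes), "be" (2 bytes), "gamma" (5 bytes).
        System.out.println(lineStartOffsets("alpha\nbe\ngamma\n")); // prints [0, 6, 9]
    }
}
```

The third line arrives with key 9, not 3, so a test like "key > 500" in the mapper would cut off at byte 500 rather than at line 500.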
You can modify NLineInputFormat to achieve what you want. NLineInputFormat gives each mapper N lines (in your case N=500). Since you are interested in only the first 500 lines of each file, the record reader for your modified NLineInputFormat would work as follows:

- Get the input split and check its start position.
- If the start position is 0, read the first 500 lines.
- Otherwise you have a file split that is in the middle of the file. Don't bother reading anything, because the mapper that reads from the beginning of the file is already reading the first 500 lines. Just indicate no more input.

-Tarandeep

On Fri, Jun 26, 2009 at 10:35 AM, Ramakishore Yelamanchilli <[email protected]> wrote:

> I think map function gets the line number as key. You can ignore the other
> lines after the key value 500.
>
> Thanks
>
> -----Original Message-----
> From: Leiz [mailto:[email protected]]
> Sent: Friday, June 26, 2009 8:57 AM
> To: [email protected]
> Subject: hwo to read a text file in Map function until reaching specific line
>
> For example, I have a text file with 1000 lines.
> I only want to read the first 500 lines of the file.
> How can I do that in the Map function?
>
> Thanks
>
> --
> View this message in context:
> http://www.nabble.com/hwo-to-read-a-text-file-in-Map-function-until-reaching-specific-line-tp24222783p24222783.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
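As a rough illustration of the record reader logic described in the reply above, the following plain-Java sketch makes the same decision based on the split's start position. It does not use the Hadoop API: `FirstNLines` and `readSplit` are hypothetical names, and a real implementation would put this check inside a custom record reader. It also assumes that the reader whose split starts at offset 0 is allowed to read all N lines, even if they run past the end of that split.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class FirstNLines {
    // Hypothetical stand-in for a record reader's per-split logic.
    // splitStart is the byte offset at which this split begins in its file.
    static List<String> readSplit(long splitStart, BufferedReader in, int n) {
        List<String> lines = new ArrayList<>();
        if (splitStart != 0) {
            // Split from the middle of the file: indicate no more input.
            return lines;
        }
        try {
            String line;
            // Stop after n lines or at end of input, whichever comes first.
            while (lines.size() < n && (line = in.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Build a 1000-line "file" in memory, as in the original question.
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 1000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        String file = sb.toString();

        List<String> head = readSplit(0, new BufferedReader(new StringReader(file)), 500);
        System.out.println(head.size() + " / " + head.get(499)); // prints 500 / line 500

        List<String> mid = readSplit(4096, new BufferedReader(new StringReader(file)), 500);
        System.out.println(mid.size()); // prints 0
    }
}
```

Only the split that starts at offset 0 produces records, so each file contributes at most 500 lines no matter how many splits it is divided into.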
