Re: how to read a text file in Map function until reaching specific line

Tarandeep Singh Fri, 26 Jun 2009 11:31:17 -0700

TextInputFormat gives the byte offset within the file as the key and the
entire line as the value, not the line number, so that approach won't work
for you.
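
To make the difference concrete, here is a small sketch in plain Java (no
Hadoop classes; the class and method names are made up for illustration). It
computes the starting byte offset of each line, which is the kind of key
TextInputFormat hands to map(), and shows that the keys are not 0, 1, 2...

```java
import java.util.ArrayList;
import java.util.List;

class OffsetKeys {
    // Returns the byte offset at which each line starts. This only mimics
    // the keys TextInputFormat produces; it is not Hadoop code.
    static List<Long> lineStartOffsets(byte[] data) {
        List<Long> offsets = new ArrayList<>();
        boolean atLineStart = true;
        for (int i = 0; i < data.length; i++) {
            if (atLineStart) {
                offsets.add((long) i);
                atLineStart = false;
            }
            if (data[i] == '\n') {
                atLineStart = true; // the next byte, if any, begins a new line
            }
        }
        return offsets;
    }
}
```

For the three lines "alpha", "beta", "gamma" (each ending in a newline) the
keys come out as 0, 6 and 11, so comparing the key against 500 would cut the
input off after roughly 500 bytes rather than after 500 lines.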

You can modify NLineInputFormat to achieve what you want. NLineInputFormat
gives each mapper N lines (in your case N=500).

Since you are interested only in the first 500 lines of each file, the record
reader for the modified NLineInputFormat can be implemented as follows:

get the input split
check the start pos
if start pos == 0
  read the first 500 lines
else
  the split starts in the middle of the file, so don't read anything; the
  mapper reading from the beginning of the file already reads the first
  500 lines. Just indicate no more input.
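
The steps above could be sketched roughly like this. This is plain Java, not
an actual Hadoop RecordReader: the class name, the method and the splitStart
parameter are hypothetical stand-ins for what the real record reader would
get from its input split.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class FirstLinesReader {
    // splitStart: byte offset at which this split begins in the file
    // (assumed to come from the input split in a real job).
    // maxLines: how many leading lines to keep, e.g. 500.
    static List<String> readSplit(Reader in, long splitStart, int maxLines) {
        List<String> lines = new ArrayList<>();
        if (splitStart != 0) {
            // Split from the middle of the file: report no input at all.
            return lines;
        }
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            // Stop at maxLines or at end of input, whichever comes first.
            while (lines.size() < maxLines && (line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

Calling readSplit(reader, 0, 500) on a split that starts at offset 0 returns
at most the first 500 lines; any other start offset returns an empty list,
which corresponds to "indicate no more input".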

-Tarandeep

On Fri, Jun 26, 2009 at 10:35 AM, Ramakishore Yelamanchilli <
[email protected]> wrote:

> I think the map function gets the line number as the key. You can ignore
> the other lines after the key value 500.
>
> Thanks
>
> -----Original Message-----
> From: Leiz [mailto:[email protected]]
> Sent: Friday, June 26, 2009 8:57 AM
> To: [email protected]
> Subject: how to read a text file in Map function until reaching specific
> line
>
>
> For example, I have a text file with 1000 lines.
> I only want to read the first 500 lines of the file.
> How can I do this in the Map function?
>
> Thanks
>
>
> --
> View this message in context:
>
> http://www.nabble.com/hwo-to-read-a-text-file-in-Map-function-until-reaching-specific-line-tp24222783p24222783.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>
