Hi Rahul,
How did you set the configuration "mapred.line.input.format.linespermap"
and your input format? You have to set them in hadoop-site.xml or pass
them to the job through the -D option.
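For example, a hadoop-site.xml entry could look roughly like the fragment
below. The value 10 is the one from your mail; the description text is only
my wording. The input format itself is usually set in the job driver with
JobConf.setInputFormat(NLineInputFormat.class).

```xml
<!-- Sketch of a hadoop-site.xml entry; goes inside <configuration> -->
<property>
  <name>mapred.line.input.format.linespermap</name>
  <value>10</value>
  <description>Number of input lines given to each map task.</description>
</property>
```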
NLineInputFormat puts N lines of input into each split, so each map
task gets N lines.
But the RecordReader is still LineRecordReader, which reads one line at
a time, so the key is the offset in the file and the value is the line.
That is why your map function still sees only one line per call.
If you want all N lines in a single record, you may have to override
LineRecordReader.
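To show the kind of logic such a reader would need, here is a minimal
sketch in plain Java with no Hadoop classes. NLineBatchReader and readBatch
are hypothetical names of mine, not part of Hadoop; a real version would
live in your own RecordReader and respect the split boundaries.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hadoop class): groups consecutive lines into batches.
final class NLineBatchReader {

    // Reads up to n lines from the reader.
    // Returns null once the input is exhausted; the last batch may be shorter than n.
    static List<String> readBatch(BufferedReader reader, int n) throws IOException {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> batch = new ArrayList<>(n);
        String line;
        // The size check comes first, so no extra line is consumed once the batch is full.
        while (batch.size() < n && (line = reader.readLine()) != null) {
            batch.add(line);
        }
        return batch.isEmpty() ? null : batch;
    }

    public static void main(String[] args) throws IOException {
        // Five input lines, read two at a time.
        BufferedReader reader = new BufferedReader(new StringReader("5\n3\n8\n1\n9\n"));
        List<String> batch;
        while ((batch = readBatch(reader, 2)) != null) {
            System.out.println(batch); // prints [5, 3], then [8, 1], then [9]
        }
    }
}
```

Each batch is returned as a list of strings, so the conversion to int (or to
any other Comparable type) can stay in your map code.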
Thanks
Amareshwari
Rahul Tenany wrote:
Hi, I am writing a Binary Search Tree on Hadoop, and for that I need
to use NLineInputFormat. I'll read n lines at a time, convert the numbers in
each line from string to int and then insert them into the binary tree. Once
the binary tree is built I'll search for elements in it. But even if I set
the input format to NLineInputFormat and set
mapred.line.input.format.linespermap
to 10, I am able to read only 1 line at a time. Any idea where I am going
wrong? How can I find out whether NLineInputFormat is working or not?
I want my program to work for any object that is comparable and not just
integers, so is there any way I can read N objects at a time?
I am completely stuck. Any help will be appreciated.
Thanks
Rahul