Edward J. Yoon wrote:

> How do you add input paths?
>
> On Wed, Apr 22, 2009 at 5:09 PM, nguyenhuynh.mr
> <[email protected]> wrote:
>   
>> Edward J. Yoon wrote:
>>
>>     
>>> Hi,
>>>
>>> In that case, the atomic unit of a split is a file, so you need to
>>> increase the number of files, or use TextInputFormat as below.
>>>
>>> jobConf.setInputFormat(TextInputFormat.class);
>>>
>>> On Wed, Apr 22, 2009 at 4:35 PM, nguyenhuynh.mr
>>> <[email protected]> wrote:
>>>
>>>       
>>>> Hi all!
>>>>
>>>>
>>>> I have an MR job that imports content into HBase.
>>>>
>>>> The content is text files in HDFS. I use map files to store the local
>>>> path of the contents.
>>>>
>>>> Each content item has its own map file (the map file is a text file in
>>>> HDFS and contains one line of info).
>>>>
>>>>
>>>> I created a maps directory to hold the map files, and this maps
>>>> directory is used as the input path for the job.
>>>>
>>>> When I run the job, the number of map tasks equals the number of map
>>>> files. Ex: I have 5 map files -> 5 map tasks.
>>>>
>>>> Therefore, the map phase is slow :(
>>>>
>>>> Why is the map phase slow when the number of map tasks is large and
>>>> equal to the number of files?
>>>>
>>>> * P.S.: The jobs run on 3 nodes: 1 server and 2 slaves.
>>>>
>>>> Please help me!
>>>> Thanks.
>>>>
>>>> Best,
>>>> Nguyen.
>>>>
>>>>
>>>>
>>>>
>>>>         
>>>
>>>
>>>       
>> Currently, I use TextInputFormat as the InputFormat for the map phase.
>>
>>     
>
>
>
> Thanks for your help!
I use FileInputFormat to add input paths.
Something like:
    FileInputFormat.setInputPaths(jobConf, new Path("dir"));

"dir" is a directory that contains the input files.
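
To illustrate the earlier point in this thread that a file is the atomic
unit of a split, here is a rough sketch of how the number of map tasks
follows from the input files. It is a simplified model, not Hadoop's
actual split computation: the class name SplitCountSketch, the method
countSplits, and the 64 MB split size are assumptions made up for this
example, and empty files are simply skipped.

```java
public class SplitCountSketch {

    /**
     * Simplified model (an assumption, not Hadoop's real algorithm):
     * every file is split on its own, so a non-empty file yields
     * ceil(size / splitSize) splits and never shares a split with
     * another file.
     */
    public static long countSplits(long[] fileSizes, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        long total = 0;
        for (long size : fileSizes) {
            if (size <= 0) {
                continue; // empty files are ignored in this sketch
            }
            total += (size + splitSize - 1) / splitSize; // ceiling division
        }
        return total;
    }

    public static void main(String[] args) {
        long splitSize = 64L * 1024 * 1024; // assumed split size: 64 MB

        // Five one-line map files: each one becomes its own split.
        long[] fiveTinyFiles = {100, 100, 100, 100, 100};
        System.out.println(countSplits(fiveTinyFiles, splitSize)); // prints 5

        // One 200 MB file: split by size instead.
        long[] oneLargeFile = {200L * 1024 * 1024};
        System.out.println(countSplits(oneLargeFile, splitSize)); // prints 4
    }
}
```

Under this model, 5 small map files always give 5 splits, and so 5 map
tasks, no matter how small each file is.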

Best,
Nguyen

