Edward J. Yoon wrote:
> How do you add input paths?
>
> On Wed, Apr 22, 2009 at 5:09 PM, nguyenhuynh.mr
> <[email protected]> wrote:
>
>> Edward J. Yoon wrote:
>>
>>> Hi,
>>>
>>> In that case, the atomic unit of a split is a file. So you need to
>>> increase the number of files, or use TextInputFormat as below:
>>>
>>> jobConf.setInputFormat(TextInputFormat.class);
>>>
>>> On Wed, Apr 22, 2009 at 4:35 PM, nguyenhuynh.mr
>>> <[email protected]> wrote:
>>>
>>>> Hi all!
>>>>
>>>> I have an MR job that imports content into HBase.
>>>>
>>>> The content is text files in HDFS. I use map files to store the
>>>> local paths of the contents.
>>>>
>>>> Each content item has its own map file (a map file is a text file
>>>> in HDFS and contains 1 line of info).
>>>>
>>>> I created a maps directory to contain the map files, and this maps
>>>> directory is used as the input path for the job.
>>>>
>>>> When I run the job, the number of map tasks is the same as the
>>>> number of map files.
>>>> Ex: I have 5 map files -> 5 map tasks.
>>>>
>>>> Therefore, the map phase is slow :(
>>>>
>>>> Why is the map phase slow when the number of map tasks is large and
>>>> equal to the number of files?
>>>>
>>>> P.S. The jobs run on 3 nodes: 1 server and 2 slaves.
>>>>
>>>> Please help me!
>>>> Thanks.
>>>>
>>>> Best,
>>>> Nguyen.
>>>>
>>>
>> Currently, I use TextInputFormat as the InputFormat for the map phase.
>>
>
Thanks for your help!

I use FileInputFormat to add input paths. Something like:

FileInputFormat.setInputPaths(jobConf, new Path("dir"));

Here "dir" is a directory that contains the input files.

Best,
Nguyen
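
To illustrate the point quoted above, that the atomic unit of a split is a file, here is a minimal plain-Java sketch of how the number of map tasks follows from the input files. It is not Hadoop's code: the class and method names are hypothetical, the 64 MB split size is an assumed value, and the model is simplified (each file is split on its own, every file yields at least one split, and Hadoop's rounding slack is ignored).

```java
// Hypothetical, simplified model of per-file input splitting (not Hadoop code).
class SplitCountSketch {

    // Splits produced by a single file: ceil(fileBytes / splitBytes),
    // with an assumed minimum of one split per file.
    static int splitsForFile(long fileBytes, long splitBytes) {
        long whole = (fileBytes + splitBytes - 1) / splitBytes; // ceiling division
        return (int) Math.max(1, whole);
    }

    // Files are never merged into one split, so the total is a per-file sum.
    static int totalSplits(long[] fileSizes, long splitBytes) {
        int total = 0;
        for (long size : fileSizes) {
            total += splitsForFile(size, splitBytes);
        }
        return total;
    }

    public static void main(String[] args) {
        long splitBytes = 64L * 1024 * 1024; // assumed 64 MB split size

        // Five one-line map files, a few hundred bytes each (made-up sizes).
        long[] fiveTinyFiles = {120, 95, 130, 110, 101};
        System.out.println(totalSplits(fiveTinyFiles, splitBytes)); // prints 5

        // One 200 MB file, for comparison.
        long[] oneLargeFile = {200L * 1024 * 1024};
        System.out.println(totalSplits(oneLargeFile, splitBytes)); // prints 4
    }
}
```

Under these assumptions, five tiny files always give five splits, and so five map tasks, however little data each file holds.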
