Edward J. Yoon wrote:
> As far as I know, FileInputFormat.getSplits() returns a number of
> splits computed automatically from the number of files and blocks.
> BTW, what version of Hadoop/HBase?
>
> I tried to test that code
> (http://wiki.apache.org/hadoop/Hbase/MapReduce) on my cluster (Hadoop
> 0.19.1 and HBase 0.19.0). The number of input paths was 2, map tasks
> were 274.
>
> Below is my changed code for v0.19.0.
> ---
>   public JobConf createSubmittableJob(String[] args) {
>     JobConf c = new JobConf(getConf(), TestImport.class);
>     c.setJobName(NAME);
>     FileInputFormat.setInputPaths(c, args[0]);
>
>     c.set("input.table", args[1]);
>     c.setMapperClass(InnerMap.class);
>     c.setNumReduceTasks(0);
>     c.setOutputFormat(NullOutputFormat.class);
>     return c;
>   }
>
> On Thu, Apr 23, 2009 at 6:19 PM, nguyenhuynh.mr
> <[email protected]> wrote:
>
>> Edward J. Yoon wrote:
>>
>>> How do you add input paths?
>>>
>>> On Wed, Apr 22, 2009 at 5:09 PM, nguyenhuynh.mr
>>> <[email protected]> wrote:
>>>
>>>> Edward J. Yoon wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> In that case, the atomic unit of a split is a file. So you need to
>>>>> increase the number of files, or use the TextInputFormat as below.
>>>>>
>>>>> jobConf.setInputFormat(TextInputFormat.class);
>>>>>
>>>>> On Wed, Apr 22, 2009 at 4:35 PM, nguyenhuynh.mr
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hi all!
>>>>>>
>>>>>> I have an MR job used to import contents into HBase.
>>>>>>
>>>>>> The content is text files in HDFS. I use map files to store the
>>>>>> local path of the contents.
>>>>>>
>>>>>> Each content has its own map file (a map file is a text file in
>>>>>> HDFS containing one line of info).
>>>>>>
>>>>>> I created a maps directory to contain the map files, and this
>>>>>> maps directory is used as the input path for the job.
>>>>>>
>>>>>> When I run the job, the number of map tasks is the same as the
>>>>>> number of map files.
>>>>>> Ex: I have 5 map files -> 5 map tasks.
>>>>>>
>>>>>> Therefore, the map phase is slow :(
>>>>>>
>>>>>> Why is the map phase slow when the number of map tasks is large
>>>>>> and equal to the number of files?
>>>>>>
>>>>>> * P.S.: The jobs run on 3 nodes: 1 server and 2 slaves.
>>>>>>
>>>>>> Please help me!
>>>>>> Thanks.
>>>>>>
>>>>>> Best,
>>>>>> Nguyen.
>>>>>>
>>>>>
>>>> Currently, I use TextInputFormat as the InputFormat for the map phase.
>>>>
>>>
>>> Thanks for your help!
>>>
>> I use FileInputFormat to add input paths.
>> Something like:
>> FileInputFormat.setInputPath(new Path("dir"));
>>
>> The "dir" is a directory that contains the input files.
>>
>> Best,
>> Nguyen
>>
>> Thanks!
I am using Hadoop version 0.18.2.

Cheers,
Nguyen.
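
For reference, here is a rough sketch in plain Java (no Hadoop dependency) of how the number of map tasks seems to follow from the input files in the old org.apache.hadoop.mapred.FileInputFormat. It is a simplified model under assumptions, not the actual Hadoop source: the class and method names (SplitEstimate, splitSize, countSplits) are made up for illustration, every file is assumed to be splittable, and the 1.1 slop factor and the max/min formula are taken from the 0.19-era implementation as far as I can tell, so other versions may differ.

```java
// Hypothetical helper that models the split arithmetic only.
// It does not touch HDFS or any Hadoop class.
final class SplitEstimate {
    // Assumption: the last chunk of a file may be up to 10% larger than the
    // split size before it is cut into a split of its own.
    private static final double SPLIT_SLOP = 1.1;

    // Split size is the per-task goal, capped by the block size and
    // floored by the configured minimum split size.
    static long splitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // fileSizes: length of each input file in bytes.
    // requestedMaps: the hint given with JobConf.setNumMapTasks().
    static int countSplits(long[] fileSizes, long blockSize,
                           long minSize, int requestedMaps) {
        long total = 0;
        for (long s : fileSizes) {
            total += s;
        }
        long goalSize = total / (requestedMaps == 0 ? 1 : requestedMaps);
        long size = splitSize(goalSize, Math.max(1, minSize), blockSize);

        int splits = 0;
        // Splits never span files, so every file is handled on its own.
        for (long length : fileSizes) {
            if (length == 0) {
                splits++; // an empty file still yields one (empty) split
                continue;
            }
            long remaining = length;
            while ((double) remaining / size > SPLIT_SLOP) {
                remaining -= size;
                splits++;
            }
            if (remaining != 0) {
                splits++; // the tail of the file
            }
        }
        return splits;
    }
}
```

Under this model, five one-line files (say 100 bytes each) with a 64 MB block size give five splits no matter how small the map task hint is, which matches the "5 map files -> 5 map tasks" observation above. A few large files can still produce many splits, because each file is cut roughly once per block.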
