Jean-Daniel,

I realize that. My question was whether this is the normal setup/finish-up time, about 2 minutes. If it is, then fine. I would expect that on tasks taking 10-15 minutes, a 2-minute overhead would be totally justified, and I think that this is the guideline: each task should take minutes.
Thank you,
Mark

On Mon, Apr 20, 2009 at 7:42 AM, Jean-Daniel Cryans <[email protected]> wrote:

> Mark,
>
> There is a setup price when using Hadoop, for each task a new JVM must
> be spawned. On such a small scale, you won't see any good using MR.
>
> J-D
>
> On Mon, Apr 20, 2009 at 12:26 AM, Mark Kerzner <[email protected]> wrote:
> > Hi,
> >
> > I ran a Hadoop MapReduce task in the local mode, reading and writing from
> > HDFS, and it took 2.5 minutes. Essentially the same operations on the local
> > file system without MapReduce took 1/2 minute. Is this to be expected?
> >
> > It seemed that the system lost most of the time in the MapReduce operation,
> > such as after these messages
> >
> > 09/04/19 23:23:01 INFO mapred.LocalJobRunner: reduce > reduce
> > 09/04/19 23:23:01 INFO mapred.JobClient: map 100% reduce 92%
> > 09/04/19 23:23:04 INFO mapred.LocalJobRunner: reduce > reduce
> >
> > it waited for a long time. The final output lines were
> >
> > 09/04/19 23:24:12 INFO mapred.LocalJobRunner: reduce > reduce
> > 09/04/19 23:24:12 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
> > 09/04/19 23:24:12 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://localhost/output
> > 09/04/19 23:24:13 INFO mapred.JobClient: Job complete: job_local_0001
> > 09/04/19 23:24:13 INFO mapred.JobClient: Counters: 13
> > 09/04/19 23:24:13 INFO mapred.JobClient: File Systems
> > 09/04/19 23:24:13 INFO mapred.JobClient: HDFS bytes read=138103444
> > 09/04/19 23:24:13 INFO mapred.JobClient: HDFS bytes written=107357785
> > 09/04/19 23:24:13 INFO mapred.JobClient: Local bytes read=282509133
> > 09/04/19 23:24:13 INFO mapred.JobClient: Local bytes written=376697552
> > 09/04/19 23:24:13 INFO mapred.JobClient: Map-Reduce Framework
> > 09/04/19 23:24:13 INFO mapred.JobClient: Reduce input groups=184
> > 09/04/19 23:24:13 INFO mapred.JobClient: Combine output records=185
> > 09/04/19 23:24:13 INFO mapred.JobClient: Map input records=209
> > 09/04/19 23:24:13 INFO mapred.JobClient: Reduce output records=184
> > 09/04/19 23:24:13 INFO mapred.JobClient: Map output bytes=91863989
> > 09/04/19 23:24:13 INFO mapred.JobClient: Map input bytes=69051592
> > 09/04/19 23:24:13 INFO mapred.JobClient: Combine input records=185
> > 09/04/19 23:24:13 INFO mapred.JobClient: Map output records=209
> > 09/04/19 23:24:13 INFO mapred.JobClient: Reduce input records=184
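One rough way to see where the time went in the quoted log is to measure the gaps between consecutive timestamps. The sketch below is only illustrative: the helper names `parse_timestamp` and `largest_gap` are made up for this example, the sample lines are copied from the excerpt in the thread, and the excerpt may omit lines, so the gap reflects only what is shown.

```python
from datetime import datetime

# Lines copied from the log excerpt in the thread (the excerpt is partial,
# so any gap computed here covers only the lines that were posted).
LOG_LINES = [
    "09/04/19 23:23:01 INFO mapred.LocalJobRunner: reduce > reduce",
    "09/04/19 23:23:01 INFO mapred.JobClient: map 100% reduce 92%",
    "09/04/19 23:23:04 INFO mapred.LocalJobRunner: reduce > reduce",
    "09/04/19 23:24:12 INFO mapred.LocalJobRunner: reduce > reduce",
    "09/04/19 23:24:12 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.",
    "09/04/19 23:24:13 INFO mapred.JobClient: Job complete: job_local_0001",
]


def parse_timestamp(line):
    """Parse the leading 'yy/mm/dd HH:MM:SS' stamp (first 17 characters)."""
    return datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")


def largest_gap(lines):
    """Return (gap_seconds, line_before_gap) for the longest pause."""
    stamps = [parse_timestamp(line) for line in lines]
    gaps = [
        ((later - earlier).total_seconds(), line)
        for earlier, later, line in zip(stamps, stamps[1:], lines)
    ]
    return max(gaps, key=lambda gap: gap[0])


seconds, after_line = largest_gap(LOG_LINES)
print(f"Longest pause: {seconds:.0f} s, after: {after_line}")
```

On the posted excerpt, the longest pause falls between the 23:23:04 and 23:24:12 entries, which matches the "waited for a long time" observation in the original message.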
