Is your input data compressed? If so, you will get one mapper per file.

Miles

2009/4/21 javateck javateck <[email protected]>:
> Hi Koji,
>
> Thanks for helping.
>
> I don't know why hadoop is just using 2 out of 10 map tasks slots.
>
> Sure, I just cut and paste the job tracker web UI, clearly I set the max
> tasks to 10(which I can verify from hadoop-site.xml and from the individual
> job configuration also), and I did have the first mapreduce running at 10
> map tasks when I checked from UI, but all subsequent queries are running
> with 2 map tasks. And I have almost 176 files with each input file around
> 62~75MB.
>
> mapred.tasktracker.map.tasks.maximum    10
>
> Kind    % Complete  Num Tasks  Pending  Running  Complete  Killed  Failed/Killed Task Attempts
> map     28.04%      189        134      2        53        0       0 / 0
> reduce  0.00%       1          1        0        0         0       0 / 0
>
> Failed/Killed Task Attempts:
> <http://etsx18.apple.com:50030/jobfailures.jsp?jobid=job_200904211923_0025>
> reduce, Pending:
> <http://etsx18.apple.com:50030/jobtasks.jsp?jobid=job_200904211923_0025&type=reduce&pagenum=1&state=pending>
>
> On Tue, Apr 21, 2009 at 1:56 PM, Koji Noguchi <[email protected]> wrote:
>
>> It's probably a silly question, but you do have more than 2 mappers on
>> your second job?
>>
>> If yes, I have no idea what's happening.
>>
>> Koji
>>
>> -----Original Message-----
>> From: javateck javateck [mailto:[email protected]]
>> Sent: Tuesday, April 21, 2009 1:38 PM
>> To: [email protected]
>> Subject: Re: mapred.tasktracker.map.tasks.maximum
>>
>> right, I set it in hadoop-site.xml before starting the whole hadoop
>> processes, I have one job running fully utilizing the 10 map tasks, but
>> subsequent queries are only using 2 of them, don't know why.
>> I have enough RAM also, no paging out is happening, I'm running on
>> 0.18.3.
>> Right now I put all processes on one machine, namenode, datanode,
>> jobtracker, tasktracker, I have a 2*4core CPU, and 20GB RAM.
>>
>>
>> On Tue, Apr 21, 2009 at 1:25 PM, Koji Noguchi
>> <[email protected]> wrote:
>>
>> > This is a cluster config and not a per job config.
>> >
>> > So this has to be set when the mapreduce cluster first comes up.
>> >
>> > Koji
>> >
>> >
>> > -----Original Message-----
>> > From: javateck javateck [mailto:[email protected]]
>> > Sent: Tuesday, April 21, 2009 1:20 PM
>> > To: [email protected]
>> > Subject: mapred.tasktracker.map.tasks.maximum
>> >
>> > I set my "mapred.tasktracker.map.tasks.maximum" to 10, but when I run a
>> > task, it's only using 2 out of 10, any way to know why it's only using
>> > 2?
>> > thanks
>> >
>> >

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
