Hi Koji,

Thanks for helping.
I don't know why Hadoop is using only 2 of the 10 map task slots. Yes, the job has more than 2 mappers; I pasted the JobTracker web UI output below. I clearly set the max tasks to 10 (which I can verify from hadoop-site.xml and from the individual job configuration), and the first MapReduce job did run with 10 map tasks when I checked the UI, but all subsequent queries are running with only 2 map tasks. I have about 176 input files, each around 62~75MB.

mapred.tasktracker.map.tasks.maximum    10

Kind     % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
map      28.04%       189         134       2         53         0        0 / 0
reduce   0.00%        1           1         0         0          0        0 / 0

Failed/Killed Task Attempts: <http://etsx18.apple.com:50030/jobfailures.jsp?jobid=job_200904211923_0025>
reduce, Pending: <http://etsx18.apple.com:50030/jobtasks.jsp?jobid=job_200904211923_0025&type=reduce&pagenum=1&state=pending>

On Tue, Apr 21, 2009 at 1:56 PM, Koji Noguchi <[email protected]> wrote:

> It's probably a silly question, but you do have more than 2 mappers on
> your second job?
>
> If yes, I have no idea what's happening.
>
> Koji
>
> -----Original Message-----
> From: javateck javateck [mailto:[email protected]]
> Sent: Tuesday, April 21, 2009 1:38 PM
> To: [email protected]
> Subject: Re: mapred.tasktracker.map.tasks.maximum
>
> right, I set it in hadoop-site.xml before starting the whole hadoop
> processes, I have one job running fully utilizing the 10 map tasks, but
> subsequent queries are only using 2 of them, don't know why.
> I have enough RAM also, no paging out is happening, I'm running on
> 0.18.3.
> Right now I put all processes on one machine, namenode, datanode,
> jobtracker, tasktracker, I have a 2*4core CPU, and 20GB RAM.
>
>
> On Tue, Apr 21, 2009 at 1:25 PM, Koji Noguchi
> <[email protected]> wrote:
>
> > This is a cluster config and not a per job config.
> >
> > So this has to be set when the mapreduce cluster first comes up.
> >
> > Koji
> >
> >
> > -----Original Message-----
> > From: javateck javateck [mailto:[email protected]]
> > Sent: Tuesday, April 21, 2009 1:20 PM
> > To: [email protected]
> > Subject: mapred.tasktracker.map.tasks.maximum
> >
> > I set my "mapred.tasktracker.map.tasks.maximum" to 10, but when I run a
> > task, it's only using 2 out of 10, any way to know why it's only using
> > 2?
> > thanks
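
For reference, here is a minimal sketch of how the setting discussed in this thread is normally written in hadoop-site.xml. The property name and the value 10 come from the messages above; the surrounding <configuration> layout is assumed from the standard Hadoop 0.18 config file format and is not copied from the poster's actual file.

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml (sketch; only the one property from this thread is shown) -->
<configuration>
  <property>
    <!-- Maximum number of map tasks a single TaskTracker runs at once -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <!-- Value reported in this thread -->
    <value>10</value>
  </property>
</configuration>
```

As Koji notes, this is a cluster-level setting, so the TaskTracker has to be started (or restarted) with it in place; setting it only in a job's configuration is not expected to change the slot count.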
