Hi, I'm CC'ing this to hive-user as well.
I tried a simple join between two tables, 2.2GB and 137MB:

select count(*) from A JOIN B ON (A.a = B.b);

The query ran for 7 hours. I am sure this is not normal. The reducer gets stuck in the "reduce > reduce" phase; the map and copy phases complete in a matter of minutes. Please see my previous mail below for my config and vmstat output. My job has 40 maps and 7 reduces.

My JT and TT logs don't show any warnings, except that one of my nodes got blacklisted because of "Too many fetch failures". Initially there was an error in that node's hosts file. I corrected it and restarted the cluster, but even then that node gets blacklisted frequently. Should I restart the node after changing the hosts file?

Any help? 7 hours is too long for such a simple query.

On Thu, Sep 22, 2011 at 5:43 AM, Raj V <rajv...@yahoo.com> wrote:
> 2GB for a task tracker? Here are some possible thoughts.
> Compress map output.
> Change mapred.reduce.slowstart.completed.maps
>
> By the way, I see no swapping. Anything interesting in the task tracker
> log? System log?
>
> Raj
>
>
>________________________________
> >From: john smith <js1987.sm...@gmail.com>
> >To: common-u...@hadoop.apache.org
> >Sent: Wednesday, September 21, 2011 4:52 PM
> >Subject: Reducer hanging ( swapping? )
> >
> >Hi Folks,
> >
> >I am running hive on a 10 node cluster. Since my hive queries have joins
> >in them, their reduce phases are a bit heavy.
> >
> >I have 2GB RAM on each TT. The problem is that my reducer hangs at 76%
> >for a long time. I guess this is due to excessive swapping from disk to
> >memory. My vmstat output (on one of the TTs) shows:
> >
> >procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> > r  b   swpd   free   buff   cache  si  so  bi  bo  in  cs  us  sy  id  wa
> > 1  0   1860  34884 189948 1997644   0   0   2   1   0   1   0   0 100   0
> >
> >My related config params are pasted below. (I turned off speculative
> >execution for both maps and reduces.)
> >Can anyone suggest some improvements to make my reduce phase a bit
> >faster? (I've allotted 900MB per task and reduced the other params, but
> >it is still not showing any improvement.) Any suggestions?
> >
> >========================================
> >
> ><property>
> >  <name>mapred.min.split.size</name>
> >  <value>65536</value>
> ></property>
> >
> ><property>
> >  <name>mapred.reduce.copy.backoff</name>
> >  <value>5</value>
> ></property>
> >
> ><property>
> >  <name>io.sort.factor</name>
> >  <value>60</value>
> ></property>
> >
> ><property>
> >  <name>mapred.reduce.parallel.copies</name>
> >  <value>25</value>
> ></property>
> >
> ><property>
> >  <name>io.sort.mb</name>
> >  <value>70</value>
> ></property>
> >
> ><property>
> >  <name>io.file.buffer.size</name>
> >  <value>32768</value>
> ></property>
> >
> ><property>
> >  <name>mapred.child.java.opts</name>
> >  <value>-Xmx900m</value>
> ></property>
> >
> >===================================
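[Editor's note: since the smaller table B is only ~137MB, one option the thread does not mention is a Hive map-side join, which loads the small table into each mapper's memory and skips the reduce-phase join entirely. This is a hedged sketch, not a tested fix; it assumes B is the 137MB table and a Hive version of that era (0.7.x) where the MAPJOIN hint is honored:]

```sql
-- Sketch (not from the thread): hint Hive to hold the small table B
-- (~137MB) in each mapper's memory and join map-side, avoiding the
-- reduce-phase join that is hanging.
SELECT /*+ MAPJOIN(B) */ COUNT(*)
FROM A JOIN B ON (A.a = B.b);
```

Whether B actually fits depends on the 900MB task heap set in mapred.child.java.opts above.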
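[Editor's note: Raj's two suggestions above could be expressed as mapred-site.xml entries in the same style as the quoted config. The property names are the standard Hadoop 0.20-era ones; the slowstart value is an illustrative guess, not a tuned setting:]

```xml
<!-- Sketch of Raj's suggestions (value below is illustrative, not tuned).
     Compressing map output shrinks the data reducers must fetch. -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>

<!-- Delay reducer start until 80% of maps have finished, so reduce
     slots are not tied up copying while maps still run (default 0.05). -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>0.80</value>
</property>
```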