Adding the right hive users alias.
On Fri, May 23, 2014 at 5:52 PM, Pala M Muthaia <mchett...@rocketfuelinc.com> wrote:
> Hi,
>
> I am trying to run a relatively heavy Hive query that joins 3 tables. The
> query succeeds on MR after increasing the mapper and reducer container
> memory:
>
> set mapreduce.map.memory.mb=4096;
> set mapreduce.reduce.memory.mb=8192;
>
> However, the same query, with the same settings, seems to get stuck in
> Reducer 2 on Tez. (The query is a join between 3 tables, hence it has
> 3 map and 2 reduce nodes in the DAG.)
>
> By stuck, I mean that for a long time I see only the following in the
> container logs:
>
> 2014-05-23 19:08:54,729 INFO [AMRM Callback Handler Thread] org.apache.tez.dag.app.rm.TaskScheduler: App total resource memory: 0 cpu: 0 taskAllocations: 301
>
> I need help with the following 2 questions:
>
> 1. Is there a separate setting for Tez to specify the amount of memory
> for a container, equivalent to the *.memory.mb settings for MapReduce?
> Maybe that value needs to be updated.
>
> 2. I already looked at the logs on the AM, and I only see the above log
> statements. How do I get more information on why the reduce node in the
> query DAG is not progressing? Can I get more info from the reduce task
> logs? How do I determine the machines on which the reduce tasks were
> scheduled, so that I can look up the task logs, if any? The YARN
> ResourceManager UI doesn't show such information.
>
> When I reduced the amount of data read from one of the large tables by
> introducing sampling, the query succeeded. I suspect a memory issue, but
> I am not sure how much memory was allocated in the first place.
>
> Thanks.
> -pala