At first look, this seems like a heap out-of-memory issue, but a more
detailed analysis is needed to nail it down. The Hive and container logs
can provide more insight.
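For example, since the diagnostics below name the attempt
attempt_1404402886929_0036_m_000000_0, the aggregated container logs for
that application could be pulled with something like the following
(a sketch; it assumes log aggregation is enabled on the cluster,
otherwise the logs live under the NodeManager's local log dirs):

```shell
# Fetch all container logs for the stuck job; the application id is
# embedded in the attempt name (attempt_<appId>_m_000000_0).
yarn logs -applicationId application_1404402886929_0036 > app_0036.log

# Scan for the usual suspects behind exit code 143 / hangs.
grep -iE 'OutOfMemoryError|GC overhead|Killed|Exit code' app_0036.log
```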

We gave a presentation at Hadoop Summit on "Debugging Hive with Hadoop".
The slide deck link is below; slide 42 and the next couple of slides
detail how to investigate stuck Hive jobs. Hope this helps.
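As a rough sketch of that approach (the pid below is hypothetical;
run this on the worker node hosting the stuck attempt), take thread
dumps of the hung task JVM and compare them:

```shell
# MapReduce child tasks show up as YarnChild in jps output; find the
# pid of the stuck mapper's JVM on its node.
jps -lm | grep YarnChild

# Take two thread dumps ~30s apart (replace 12345 with the real pid).
# A thread parked in the same frame in both dumps points at the hang.
jstack 12345 > stack1.txt
sleep 30
jstack 12345 > stack2.txt
diff stack1.txt stack2.txt
```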

Slide deck link:
http://www.slideshare.net/altiscale/debugging-hive-with-hadoop-in-the-cloud

--Bala G.


On Tue, Jul 8, 2014 at 1:18 PM, Tim Harsch <thar...@yarcdata.com> wrote:

> Hi,
> I asked a question on Stack Overflow
> (
> http://stackoverflow.com/questions/24621002/hive-job-stuck-at-map-100-reduce-0)
> which hasn't seemed to get much traction, so I'd like to ask it here
> as well.
>
> I'm running hive-0.12.0 on hadoop-2.2.0. After submitting the query:
>
> select  i_item_desc
>   ,i_category
>   ,i_class
>   ,i_current_price
>   ,i_item_id
>   ,sum(ws_ext_sales_price) as itemrevenue
>   ,sum(ws_ext_sales_price)*100/sum(sum(ws_ext_sales_price)) over
>       (partition by i_class) as revenueratio
> from item JOIN web_sales ON (web_sales.ws_item_sk = item.i_item_sk) JOIN
> date_dim ON (web_sales.ws_sold_date_sk = date_dim.d_date_sk)
> where item.i_category in ('Jewelry', 'Sports', 'Books')
>     and date_dim.d_date between '2001-01-12' and '2001-02-11'
>     and ws_sold_date between '2001-01-12' and '2001-02-11'
> group by
>     i_item_id
>     ,i_item_desc
>     ,i_category
>     ,i_class
>     ,i_current_price
> order by
>     i_category
>     ,i_class
>     ,i_item_id
>     ,i_item_desc
>     ,revenueratio
> limit 100;
>
> I get the following errors in the logs:
>
> Hadoop job information for Stage-3: number of mappers: 1; number of
> reducers: 1
> 2014-07-07 15:26:16,893 Stage-3 map = 0%,  reduce = 0%
> 2014-07-07 15:26:22,033 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU
> 1.32 sec
>
> And then the last line repeats every second or so ad infinitum. If I look
> at container logs I see:
>
> 2014-07-07 17:12:17,477 INFO [AsyncDispatcher event handler]
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
> report from attempt_1404402886929_0036_m_000000_0: Container killed by the
> ApplicationMaster.
> Container killed on request. Exit code is 143
>
> I've searched for the Exit code 143, but most of the stuff out there
> refers to memory issues, and I have memory set pretty large (following
> the advice of Container is running beyond memory limits
> <http://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits>).
> I have even tried adding 6GB to each of the settings in that post,
> still no luck.
>
> I've also run the job with:
> hive -hiveconf hive.root.logger=DEBUG,console
> which really just produces a lot more info, but nothing I see makes
> clear what the issue is.
> I'm not sure where else to look...
>
>
