Thanks for your detailed explanation, Pau. The query never returned even
after 4 hours; I had to cancel it. The reason might be that I have too
many small ORC files as input to the Hive table.
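
If the small files are the cause, one option I'm considering (a sketch I
have not tested yet; the partition values below are just examples) is
merging them with Hive's built-in ORC file merge:

-- Merge small ORC files inside one partition
ALTER TABLE test PARTITION (country='US', `date`='2019-11-01') CONCATENATE;

-- Encourage Hive to merge small output files when rewriting the table
set hive.merge.tezfiles=true;
set hive.merge.smallfiles.avgsize=134217728;
set hive.merge.size.per.task=268435456;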

Also, you are right that my cluster capacity is quite limited. But do you
suggest we should just keep increasing resources as the data grows?

Thanks,
Sai.

On Fri, Nov 15, 2019 at 9:41 AM Pau Tallada <tall...@pic.es> wrote:

> Hi Sai,
>
> Let me summarize some of your data:
>
> You have a 9-billion-record table with 4 columns, which should account for
> a minimum raw size of about 200 GiB (not including the string column).
> You want to select ALL columns from rows with a specific value in a column
> which is not partitioned, so Hive has to read EVERYTHING in order to
> deliver the results you request.
> It does not matter that only a few rows are returned; what matters is that
> the cost of generating those few rows is high.
> And finally, your cluster only has 2 nodes and 8 cores in total.
>
> If each core is able to process 20 MiB/s of data, the query should take
> about 20 minutes to complete.
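>
> As a back-of-the-envelope check (using the 200 GiB and 20 MiB/s figures
> above):
>
> 200 GiB ≈ 204,800 MiB
> 204,800 MiB / (8 cores × 20 MiB/s) = 1,280 s ≈ 21 minutes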
>
> You don't say how long it is taking, but you can surely improve
> performance by adding more resources (cores) to your cluster.
>
> Cheers,
>
> Pau.
>
>
>
>
> Missatge de Sai Teja Desu <saiteja.d...@globalfoundries.com> del dia dv.,
> 15 de nov. 2019 a les 15:21:
>
>> Hey Pau,
>>
>> Thanks for the clarification. Yes, that helped the query start; however,
>> it was taking a very long time to retrieve just a few records.
>>
>> May I know what steps I can take to improve the performance of this kind
>> of query? I mean queries with predicates on columns that are not
>> partitioned.
>>
>> Thanks,
>> Sai.
>>
>> On Thu, Nov 14, 2019 at 12:43 PM Pau Tallada <tall...@pic.es> wrote:
>>
>>> Hi,
>>>
>>> The error is from the AM (Application Master): it has so many
>>> partitions to orchestrate that it needs lots of RAM.
>>> As Venkat said, try increasing tez.am.resource.memory.mb to 2 GB; even 4
>>> or 8 might be needed.
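>>>
>>> For example, from the Hive session (2048 is just a starting point; raise
>>> it further if the AM is still killed):
>>>
>>> set tez.am.resource.memory.mb=2048;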
>>>
>>> Cheers,
>>>
>>> Pau.
>>>
>>> Missatge de Sai Teja Desu <saiteja.d...@globalfoundries.com> del dia
>>> dj., 14 de nov. 2019 a les 18:32:
>>>
>>>> Thanks for the reply, Venkatesh. I did try to increase the Tez
>>>> container size to 4 GB, but it still gives me the same error. In
>>>> addition, below are the settings I have tried:
>>>>
>>>> set mapreduce.map.memory.mb=4096;
>>>> set mapreduce.map.java.opts=-Xmx3686m;
>>>>
>>>>
>>>> set mapreduce.reduce.memory.mb=8192;
>>>> set mapreduce.reduce.java.opts=-Xmx7372m;
>>>>
>>>>
>>>> set hive.tez.container.size=4096;
>>>> set hive.tez.java.opts=-Xmx3686m;
>>>>
>>>> Let me know if I'm missing anything or configuring incorrectly.
>>>>
>>>> Thanks,
>>>> Sai.
>>>>
>>>> On Thu, Nov 14, 2019 at 10:52 AM Venkatesh Selvaraj <
>>>> venkateshselva...@pinterest.com> wrote:
>>>>
>>>>> Try increasing the AM container memory. Set it to 2 GB, maybe.
>>>>>
>>>>> Regards,
>>>>> Venkat
>>>>>
>>>>> On Thu, Nov 14, 2019, 6:46 AM Sai Teja Desu <
>>>>> saiteja.d...@globalfoundries.com> wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> I'm new to Hive development, and I'm hitting a memory limitation
>>>>>> error when running a simple query with a predicate that should return
>>>>>> only a few records. Below are the details of the Hive table, the
>>>>>> query, and the error. Please advise me on how to efficiently query on
>>>>>> predicates that do not have partitions.
>>>>>>
>>>>>> Table Properties:
>>>>>>
>>>>>> CREATE EXTERNAL TABLE TEST(
>>>>>>   location_id double,
>>>>>>   longitude double,
>>>>>>   latitude double,
>>>>>>   state string
>>>>>> )
>>>>>> COMMENT 'This table is created for testing purposes'
>>>>>> PARTITIONED BY (country string, date string)
>>>>>> STORED AS ORC
>>>>>> LOCATION '<S3 Location>'
>>>>>>
>>>>>> Total records: 9 billion
>>>>>> Number of partitions: >4k
>>>>>>
>>>>>> EMR Cluster Properties:
>>>>>>
>>>>>> Total Memory: 48 GB
>>>>>> Number of Nodes: 2
>>>>>> Total vCores: 8
>>>>>> mapreduce.map.memory.mb=3072
>>>>>> mapreduce.map.java.opts=-Xmx2458m
>>>>>>
>>>>>>
>>>>>> Query Executed:  select * from test where location_id = 1234;
>>>>>>
>>>>>> Error: Status: Failed
>>>>>>
>>>>>> Application failed 2 times due to AM Container for exited with
>>>>>> exitCode: -104
>>>>>>
>>>>>> Failing this attempt. Diagnostics: Container is running beyond
>>>>>> physical memory limits. Current usage: 1.1 GB of 1 GB physical memory
>>>>>> used; 2.8 GB of 5 GB virtual memory used. Killing container.
>>>>>>
>>>>>> Dump of the process-tree for  :
>>>>>>
>>>>>>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>>>>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>>>>>>
>>>>>>         |- 1253 1234 1234 123 (bash) 0 0 11597648 676 /bin/bash -c
>>>>>> /usr/lib/jvm/java-openjdk/bin/java  -Xmx819m
>>>>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/app30/container_11/tmp
>>>>>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>>>>>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>>>>>> -XX:+UseParallelGC
>>>>>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>>>>>> -Dlog4j.configuration=tez-container-log4j.properties
>>>>>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11
>>>>>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=''
>>>>>> org.apache.tez.dag.app.DAGAppMaster --session
>>>>>> 1>/var/log/hadoop-yarn/containers/application_10/container_11/stdout
>>>>>> 2>/var/log/hadoop-yarn/containers/application_10/container_11/stderr
>>>>>>
>>>>>>         |- 1253 1234 1234 123  (java) 1253 1234 1234 123
>>>>>>  /usr/lib/jvm/java-openjdk/bin/java -Xmx819m
>>>>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_10/container_11/tmp
>>>>>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>>>>>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>>>>>> -XX:+UseParallelGC
>>>>>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>>>>>> -Dlog4j.configuration=tez-container-log4j.properties
>>>>>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11
>>>>>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=
>>>>>> org.apache.tez.dag.app.DAGAppMaster --session
>>>>>>
>>>>>> Container killed on request. Exit code is 143
>>>>>>
>>>>>> Container exited with a non-zero exit code 143
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>
>>> --
>>> ----------------------------------
>>> Pau Tallada Crespí
>>> Dep. d'Astrofísica i Cosmologia
>>> Port d'Informació Científica (PIC)
>>> Tel: +34 93 170 2729
>>> ----------------------------------
>>>
>>>
>
>
>
