Hey Pau,

Thanks for the clarification. Yes, that got the query started; however,
it took a very long time to retrieve just a few records.

May I know what steps I can take to improve the performance of this kind
of query? I mean queries with predicates on columns that are not
partition columns.

Thanks,
Sai.

On Thu, Nov 14, 2019 at 12:43 PM Pau Tallada <tall...@pic.es> wrote:

> Hi,
>
> The error comes from the AM (Application Master): it has so many
> partitions to orchestrate that it needs a lot of RAM.
> As Venkat said, try increasing tez.am.resource.memory.mb to 2G; even 4G
> or 8G might be needed.
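>
> A minimal sketch of that change, assuming it is set in the Hive session
> before the query runs. The value 2048 is just the 2G starting point
> suggested above, not a tested recommendation for this table:
>
> ```sql
> -- Memory for the Tez Application Master container, in MB (2G to start).
> -- The mapreduce.* and hive.tez.container.size settings size the task
> -- containers, not the AM, which is why the AM stayed at 1 GB.
> set tez.am.resource.memory.mb=2048;
> ```
>
> If the AM container is still killed, the same setting can be raised to
> 4096 or 8192.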
>
> Cheers,
>
> Pau.
>
> Message from Sai Teja Desu <saiteja.d...@globalfoundries.com> on Thu,
> 14 Nov 2019 at 18:32:
>
>> Thanks for the reply, Venkatesh. I did try increasing the Tez container
>> size to 4GB, but it still gives me the same error. In addition, below are
>> the settings I have tried:
>>
>> set mapreduce.map.memory.mb=4096;
>> set mapreduce.map.java.opts=-Xmx3686m;
>>
>>
>> set mapreduce.reduce.memory.mb=8192;
>> set mapreduce.reduce.java.opts=-Xmx7372m;
>>
>>
>> set hive.tez.container.size=4096;
>> set hive.tez.java.opts=-Xmx3686m;
>>
>> Let me know if I'm missing anything or configuring something incorrectly.
>>
>> Thanks,
>> Sai.
>>
>> On Thu, Nov 14, 2019 at 10:52 AM Venkatesh Selvaraj <
>> venkateshselva...@pinterest.com> wrote:
>>
>>> Try increasing the AM container memory. Set it to 2 gigs, maybe.
>>>
>>> Regards,
>>> Venkat
>>>
>>> On Thu, Nov 14, 2019, 6:46 AM Sai Teja Desu <
>>> saiteja.d...@globalfoundries.com> wrote:
>>>
>>>> Hello All,
>>>>
>>>> I'm new to Hive development and I'm getting a memory limit error when
>>>> running a simple query with a predicate which should return only a few
>>>> records. Below are the details of the Hive table, query and error.
>>>> Please advise me on how to efficiently query on predicates that are not
>>>> on partition columns.
>>>>
>>>> Table Properties:
>>>>
>>>> CREATE EXTERNAL TABLE TEST(
>>>>   location_id double,
>>>>   longitude double,
>>>>   latitude double,
>>>>   state string
>>>> )
>>>> COMMENT 'This table is created for testing purposes'
>>>> PARTITIONED BY(country string, date string)
>>>> STORED AS ORC
>>>> LOCATION '<S3 Location>'
>>>>
>>>> Total records: 9 billion
>>>>
>>>> Number of partitions: >4k
>>>>
>>>> EMR Cluster Properties:
>>>>
>>>> Total Memory: 48 GB
>>>> Number of Nodes: 2
>>>> Total vCores: 8
>>>> mapreduce.map.memory.mb=3072
>>>> mapreduce.map.java.opts=-Xmx2458m
>>>>
>>>>
>>>> Query Executed:  select * from test where location_id = 1234;
>>>>
>>>> Error:Status:  Failed
>>>>
>>>> Application  failed 2 times due to AM Container for exited with
>>>>  exitCode: -104
>>>>
>>>> Failing this attempt.Diagnostics: Container is running beyond physical
>>>> memory limits. Current usage: 1.1 GB of 1 GB physical memory used; 2.8 GB
>>>> of 5 GB virtual memory used. Killing container.
>>>>
>>>> Dump of the process-tree for  :
>>>>
>>>>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>>>>
>>>>         |- 1253 1234 1234 123 (bash) 0 0 11597648 676 /bin/bash -c
>>>> /usr/lib/jvm/java-openjdk/bin/java  -Xmx819m
>>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/app30/container_11/tmp
>>>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>>>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>>>> -XX:+UseParallelGC
>>>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>>>> -Dlog4j.configuration=tez-container-log4j.properties
>>>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11
>>>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=''
>>>> org.apache.tez.dag.app.DAGAppMaster --session
>>>> 1>/var/log/hadoop-yarn/containers/application_10/container_11/stdout
>>>> 2>/var/log/hadoop-yarn/containers/application_10/container_11/stderr
>>>>
>>>>         |- 1253 1234 1234 123  (java) 1253 1234 1234 123
>>>>  /usr/lib/jvm/java-openjdk/bin/java -Xmx819m
>>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_10/container_11/tmp
>>>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
>>>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
>>>> -XX:+UseParallelGC
>>>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
>>>> -Dlog4j.configuration=tez-container-log4j.properties
>>>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11
>>>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=
>>>> org.apache.tez.dag.app.DAGAppMaster --session
>>>>
>>>> Container killed on request. Exit code is 143
>>>>
>>>> Container exited with a non-zero exit code 143
>>>>
>>>>
>>>>
>>>>
>
> --
> ----------------------------------
> Pau Tallada Crespí
> Dep. d'Astrofísica i Cosmologia
> Port d'Informació Científica (PIC)
> Tel: +34 93 170 2729
> ----------------------------------
>
>
