Hey Pau, Thanks for the clarification. Yes, that helped to start the query, however the query was taking huge time to retrieve a few records.
May I know what steps can I take to make this kind of query performance better? I mean the predicates which does not have partitioning. Thanks, Sai. On Thu, Nov 14, 2019 at 12:43 PM Pau Tallada <tall...@pic.es> wrote: > Hi, > > The error is from the AM (Application Master), because it has soooooooo > many partitions to orchestrate that needs lots of RAM. > As Venkat said, try increasing tez.am.resource.memory.mb to 2G, even 4 or > 8 might be needed. > > Cheers, > > Pau. > > Missatge de Sai Teja Desu <saiteja.d...@globalfoundries.com> del dia dj., > 14 de nov. 2019 a les 18:32: > >> Thanks for the reply Venkatesh. I did tried to increase the tez container >> size to 4GB but still giving me the same error. In addition, below are the >> settings I have tried: >> >> set mapreduce.map.memory.mb=4096; >> set mapreduce.map.java.opts=-Xmx3686m; >> >> >> set mapreduce.reduce.memory.mb=8192; >> set mapreduce.reduce.java.opts=-Xmx7372m; >> >> >> set hive.tez.container.size = 4096; >> set hive.tez.java.opts =-Xmx3686m; >> >> Let me know if I'm missing anything or configuring incorrectly. >> >> Thanks, >> Sai. >> >> On Thu, Nov 14, 2019 at 10:52 AM Venkatesh Selvaraj < >> venkateshselva...@pinterest.com> wrote: >> >>> Try increasing the AM Container memory. set it to 2 gigs may be. >>> >>> Regards, >>> Venkat >>> >>> On Thu, Nov 14, 2019, 6:46 AM Sai Teja Desu < >>> saiteja.d...@globalfoundries.com> wrote: >>> >>>> Hello All, >>>> >>>> I'm new to hive development and I'm memory limitation error for running >>>> a simple query with a predicate which should return only a few records. >>>> Below are the details of the Hive table, Query and Error. Please advise me >>>> on how to efficiently query on predicates which does not have partitions. >>>> >>>> Table Properties: CREATE EXTERNAL TABLE TEST(location_id double, >>>> >>>> longitude double, >>>> >>>> latitude double, >>>> >>>> state string >>>> >>>> ) >>>> >>>> COMMENT 'This table is created for testing purposes' >>>> >>>> PARTITIONED BY(country string, date string) >>>> >>>> STORED AS ORC >>>> >>>> LOCATION '<S3 Location>' >>>> >>>> Total records: 9 Billion Records >>>> >>>> Number of partitions: >4k >>>> >>>> EMR Cluster Properties: Total Memory: 48 GB >>>> >>>> Number of Nodes: 2 >>>> >>>> Total vCores: 8 >>>> >>>> mapreduce.map.memory.mb=3072 >>>> >>>> mapreduce.map.java.opts=-Xmx2458m >>>> >>>> >>>> Query Executed: select * from test where location_id = 1234; >>>> >>>> Error:Status: Failed >>>> >>>> Application failed 2 times due to AM Container for exited with >>>> exitCode: -104 >>>> >>>> Failing this attempt.Diagnostics: Container is running beyond physical >>>> memory limits. Current usage: 1.1 GB of 1 GB physical memory used; 2.8 GB >>>> of 5 GB virtual memory used. Killing container. >>>> >>>> Dump of the process-tree for : >>>> >>>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) >>>> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE >>>> >>>> |- 1253 1234 1234 123 (bash) 0 0 11597648 676 /bin/bash -c >>>> /usr/lib/jvm/java-openjdk/bin/java -Xmx819m >>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/app30/container_11/tmp >>>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN >>>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA >>>> -XX:+UseParallelGC >>>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator >>>> -Dlog4j.configuration=tez-container-log4j.properties >>>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11 >>>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' >>>> org.apache.tez.dag.app.DAGAppMaster --session >>>> 1>/var/log/hadoop-yarn/containers/application_10/container_11/stdout >>>> 2>/var/log/hadoop-yarn/containers/application_10/container_11/stderr >>>> >>>> |- 1253 1234 1234 123 (java) 1253 1234 1234 123 >>>> /usr/lib/jvm/java-openjdk/bin/java -Xmx819m >>>> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_10/container_11/tmp >>>> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN >>>> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA >>>> -XX:+UseParallelGC >>>> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator >>>> -Dlog4j.configuration=tez-container-log4j.properties >>>> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11 >>>> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= >>>> org.apache.tez.dag.app.DAGAppMaster --session >>>> >>>> Container killed on request. Exit code is 143 >>>> >>>> Container exited with a non-zero exit code 143 >>>> >>>> >>>> >>>> > > -- > ---------------------------------- > Pau Tallada Crespí > Dep. d'Astrofísica i Cosmologia > Port d'Informació Científica (PIC) > Tel: +34 93 170 2729 > ---------------------------------- > >