Try increasing the AM container memory. Maybe set it to 2 GB.
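
As a rough sketch of what that could look like: the process dump below shows the killed container running org.apache.tez.dag.app.DAGAppMaster, so the relevant knob is likely the Tez AM size rather than the mapreduce.* settings. The example assumes the query runs on Tez and that the settings are applied before the Tez session starts (for example at the top of the script or via --hiveconf). The 2048 value is only the suggested starting point, not a tuned number.

```sql
-- Assumption: Hive on Tez; apply before the Tez AM is launched for the session.
-- Raise the Tez ApplicationMaster container from the 1 GB seen in the error to 2 GB.
SET tez.am.resource.memory.mb=2048;

-- Optional and hedged: the -Xmx819m in the dump looks like roughly 80% of the
-- 1 GB container, so the heap may scale with the container size automatically.
-- If it does not on your cluster, the AM JVM heap can be set explicitly:
-- SET tez.am.launch.cmd-opts=-Xmx1638m;

-- Query from the original message, unchanged.
select * from test where location_id = 1234;
```

If the AM still exceeds its limit at 2 GB, the same setting can be raised further, as long as it stays within the YARN maximum container size configured for the cluster.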

Regards,
Venkat

On Thu, Nov 14, 2019, 6:46 AM Sai Teja Desu <
saiteja.d...@globalfoundries.com> wrote:

> Hello All,
>
> I'm new to Hive development and I'm getting a memory limit error when running a
> simple query with a predicate that should return only a few records. Below
> are the details of the Hive table, query, and error. Please advise me on how
> to query efficiently on predicates that do not use partition columns.
>
> Table Properties:
>
> CREATE EXTERNAL TABLE TEST(
>   location_id double,
>   longitude double,
>   latitude double,
>   state string
> )
> COMMENT 'This table is created for testing purposes'
> PARTITIONED BY(country string, date string)
> STORED AS ORC
> LOCATION '<S3 Location>'
>
> Total records: 9 billion
> Number of partitions: >4k
>
> EMR Cluster Properties:
>
> Total Memory: 48 GB
> Number of Nodes: 2
> Total vCores: 8
> mapreduce.map.memory.mb=3072
> mapreduce.map.java.opts=-Xmx2458m
>
>
> Query Executed:  select * from test where location_id = 1234;
>
> Error: Status: Failed
>
> Application  failed 2 times due to AM Container for exited with  exitCode:
> -104
>
> Failing this attempt.Diagnostics: Container is running beyond physical
> memory limits. Current usage: 1.1 GB of 1 GB physical memory used; 2.8 GB
> of 5 GB virtual memory used. Killing container.
>
> Dump of the process-tree for  :
>
>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>
>         |- 1253 1234 1234 123 (bash) 0 0 11597648 676 /bin/bash -c
> /usr/lib/jvm/java-openjdk/bin/java  -Xmx819m
> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/app30/container_11/tmp
> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
> -XX:+UseParallelGC
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
> -Dlog4j.configuration=tez-container-log4j.properties
> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11
> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=''
> org.apache.tez.dag.app.DAGAppMaster --session
> 1>/var/log/hadoop-yarn/containers/application_10/container_11/stdout
> 2>/var/log/hadoop-yarn/containers/application_10/container_11/stderr
>
>         |- 1253 1234 1234 123  (java) 1253 1234 1234 123
>  /usr/lib/jvm/java-openjdk/bin/java -Xmx819m
> -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_10/container_11/tmp
> -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN
> -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA
> -XX:+UseParallelGC
> -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator
> -Dlog4j.configuration=tez-container-log4j.properties
> -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_10/container_11
> -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel=
> org.apache.tez.dag.app.DAGAppMaster --session
>
> Container killed on request. Exit code is 143
>
> Container exited with a non-zero exit code 143
>
>
>
>
