All,

We are running Hadoop 2.2.0 and Hive 0.13.0.  One typical application is to
load data as text and then convert that data to ORC to decrease query
time.  When running these processes we are seeing a significant memory leak
(roughly 4 GB leaked over about 5 days).
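For reference, the flow is roughly the following (the column definitions
here are simplified placeholders, not our real schema; the table names and
the range partition key match the conversion query further down):

-- Staging table that the raw text files are loaded into
CREATE TABLE loading_text_table (
  id    BIGINT,
  value STRING,
  range STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Destination table stored as ORC, partitioned on the same key
CREATE TABLE orc_table (
  id    BIGINT,
  value STRING
)
PARTITIONED BY (range STRING)
STORED AS ORC;

-- Dynamic-partition insert that rewrites the text data as ORC
INSERT INTO TABLE orc_table PARTITION (range)
SELECT * FROM loading_text_table;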

We're running HiveServer2 with the following options:

root     28136     1 51 May14 ?        09:51:09 /usr/java/latest/bin/java
-Xmx2048m -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40
-XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:MaxPermSize=1024m
-XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15
-XX:-UseGCOverheadLimit -XX:MaxPermSize=1024m
-Dhadoop.log.dir=/opt/hadoop/latest-hadoop/logs
-Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/latest-hadoop
-Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
-Djava.library.path=/opt/hadoop/latest-hadoop/lib/native
-Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true
-Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar
/opt/hadoop/latest-hive/lib/hive-service-0.13.0.jar
org.apache.hive.service.server.HiveServer2

A typical ORC conversion query looks like the following:

HIVE -u jdbc:hive2://hive_server:10002/db -n root \
  --hiveconf hive.exec.dynamic.partition.mode=nonstrict \
  --hiveconf hive.enforce.sorting=true \
  --hiveconf $SET_QUEUE \
  -e "insert into table orc_table partition (range) select * from loading_text_table;"

I saw a couple of tickets for memory leaks, but they seemed to deal with
failed queries.  In our case the memory usage increases linearly, and all
jobs succeed until the memory limit is exceeded.

Is there an open bug for memory leaks associated with successful jobs in
HS2?  Is there a fix for this issue?

Regards,

Bryan Jeffrey
