All, I am running Hadoop 2.4 and Hive 0.13. I consistently run out of Hive heap space when the system has been running for an extended period. If I increase the heap size it runs longer, but it still eventually throws an out-of-memory error and becomes unresponsive. Memory usage shows a clearly linear growth trend, which leads me to believe there is a significant memory leak. Has anyone else had this problem, or does anyone have insight into why I am seeing this? Below I have included (1) the Hive calls I am making, (2) the resulting Java processes spawned by these commands, and (3) the top 50 memory users according to a heap dump taken from Hive after it had run out of memory. Note that each of these calls is run several times a minute.
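For reference, a class histogram like the one in (3) can be gathered with the standard JDK tools; this is only a sketch of the general approach, assuming the growing JVM is HiveServer2 (the jps grep pattern, line count, and dump path are illustrative, not my exact commands):

# Find the HiveServer2 JVM (grep pattern is illustrative)
PID=$(jps -lm | grep -i hiveserver2 | awk '{print $1}')

# Histogram of live objects, largest consumers first (forces a full GC)
jmap -histo:live "$PID" | head -n 55

# Full heap dump for offline analysis in a tool such as Eclipse MAT
jmap -dump:live,format=b,file=/tmp/hiveserver2.hprof "$PID"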
Thanks,
Benjamin Bowman

(1) Hive/Hadoop calls:

// Copying data to HDFS
$HADOOP fs -copyFromLocal ${fullFile} $LOADING_DIR/${file}

// Loading data
$HIVE -u jdbc:hive2://$HIVE_MASTER:10002/database -n root -e "load data inpath '$LOADING_DIR/${file}' into table ${LOADING_TABLE} partition (range=$partitionDate);"

// ORC data
$HIVE -u jdbc:hive2://$HIVE_MASTER:10002/database -n root --hiveconf hive.exec.dynamic.partition.mode=nonstrict --hiveconf hive.enforce.sorting=true --hiveconf $SET_QUEUE -e "insert into table ${TABLE_NAME} partition (range) select * from ${LOADING_TABLE};"

(2) Java processes:

// Copying data to HDFS
/usr/java/latest/bin/java -Xmx1000m -Dhadoop.log.dir=/opt/hadoop/latest-hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/latest-hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/hadoop/latest-hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.fs.FsShell -copyFromLocal /data/file1.txt /database/loading/file1.txt

// Loading data
/usr/java/latest/bin/java -Xmx4096m -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:MaxPermSize=1024m -Dhadoop.log.dir=/opt/hadoop/latest-hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/latest-hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/hadoop/latest-hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/hadoop/latest-hive/lib/hive-cli-0.13.0.jar org.apache.hive.beeline.BeeLine --hiveconf hive.aux.jars.path=file:///opt/hadoop/latest-hive/lib/hive-exec-0.13.0.jar,file:///opt/hadoop/latest-hive/hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.jar -u jdbc:hive2://HIVE_MASTER:10002/database -n root -e load data inpath '/database/loading/file1.txt' into table loading_table partition (range=1402506000);

// ORC data
/usr/java/latest/bin/java -Xmx4096m -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:MaxPermSize=1024m -Dhadoop.log.dir=/opt/hadoop/latest-hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/latest-hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/opt/hadoop/latest-hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/hadoop/latest-hive/lib/hive-cli-0.13.0.jar org.apache.hive.beeline.BeeLine --hiveconf hive.aux.jars.path=file:///opt/hadoop/latest-hive/lib/hive-exec-0.13.0.jar,file:///opt/hadoop/latest-hive/hcatalog/share/hcatalog/hive-hcatalog-core-0.13.0.jar -u jdbc:hive2://HIVE_MASTER:10002/database -n root --hiveconf hive.exec.dynamic.partition.mode=nonstrict --hiveconf hive.enforce.sorting=true --hiveconf mapred.job.queue.name=orc_queue -e insert into table TABLE_NAME partition (range) select * from loading_table;

(3) Top 50 memory users of Hive heap space after 4 GB had been consumed (sizes in bytes):

Class  Count  Size
[Ljava.util.HashMap$Entry;  6073306  1372508848
[C  13007633  1034931150
java.util.HashMap$Entry  19706437  551780236
java.util.HashMap  6072908  291499584
java.lang.String  13008539  260170780
[Ljava.lang.Object;  2201432  180233440
org.datanucleus.ExecutionContextThreadedImpl  439411  119959203
com.mysql.jdbc.JDBC4ResultSet  352849  91740740
[I  2433025  83659960
org.datanucleus.FetchPlanForClass  2421672  79915176
com.mysql.jdbc.StatementImpl  352825  75151725
org.datanucleus.util.SoftValueMap$SoftValueReference  1483828  71223744
org.datanucleus.FetchPlan  439412  33395312
org.datanucleus.TransactionImpl  439411  33395236
java.lang.ref.ReferenceQueue  1318447  31642728
java.util.ArrayList  1759798  28156768
org.datanucleus.api.jdo.JDOPersistenceManager  439411  25046427
java.util.HashSet  2989733  23917864
java.util.concurrent.locks.ReentrantLock$NonfairSync  745445  20872460
java.util.IdentityHashMap  439419  19334436
java.util.HashMap$KeyIterator  352852  14114080
org.datanucleus.util.SoftValueMap  878903  14062448
org.datanucleus.api.jdo.JDOCallbackHandler  439410  14061120
java.util.HashMap$KeySet  1277338  10218704
java.util.concurrent.ConcurrentHashMap$Segment  305936  9789952
java.util.LinkedHashMap$Entry  219662  9665128
org.apache.hadoop.fs.FileSystem$Statistics$StatisticsData  226336  8148096
[Ljava.util.concurrent.ConcurrentHashMap$HashEntry;  305936  7376992
java.lang.ref.WeakReference  228210  7302720
org.datanucleus.util.WeakValueMap  439441  7031056
java.util.HashMap$Values  878871  7030968
org.datanucleus.api.jdo.JDOTransaction  439411  7030576
[Lcom.mysql.jdbc.Field;  352849  5645872
java.util.LinkedList$Entry  226598  5438352
org.datanucleus.ExecutionContextImpl$1  439411  5272932
[B  37621  3863013
java.util.HashMap$EntrySet  439686  3517488
java.util.concurrent.locks.ReentrantLock  439465  3515720
org.datanucleus.api.jdo.JDOFetchPlan  439411  3515288
org.datanucleus.cache.SoftRefCache  439411  3515288
org.datanucleus.properties.BasePropertyStore  439411  3515288
java.util.IdentityHashMap$Values  439410  3515280
[Ljava.util.concurrent.ConcurrentHashMap$Segment;  19126  2753504
[S  21523  2370068
java.util.Hashtable$Entry  71292  1996176
org.datanucleus.state.LockManagerImpl  221721  1773768
java.util.concurrent.atomic.AtomicBoolean  353119  1412476
java.lang.Class  9397  1353168
java.util.concurrent.ConcurrentHashMap  19126  1071056
[Ljava.util.Hashtable$Entry;  544  1018368
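For completeness, the linear heap-growth trend described at the top of this mail can be tracked by sampling the JVM periodically; a minimal sketch using jstat (the jps grep pattern, 60-second interval, and log path are illustrative):

# Same PID lookup as above (grep pattern is illustrative)
PID=$(jps -lm | grep -i hiveserver2 | awk '{print $1}')

# Append one timestamped GC/heap utilization sample per minute
jstat -gcutil "$PID" 60000 | while read -r line; do
    echo "$(date +%FT%T) $line"
done >> /tmp/hive_heap_trend.log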