Hello again,

Some progress has been made on this issue. From initial testing, this patch has fixed my problem. I had my cluster running all night and the memory usage is floating around 700 MB; before, it would be over 1 GB and climbing.
https://issues.apache.org/jira/browse/HIVE-7353

-Benjamin

On Tue, Jul 8, 2014 at 5:51 AM, jonas.partner <jonas.part...@opencredo.com> wrote:

> Hi Navis,
>
> After a run to the point where we are seeing exceptions, we see the below:
>
> num    #instances     #bytes  class name
> ----------------------------------------------
>   1:  502562   16081984  java.util.HashMap$Entry
>   2:  459139   14692448  java.util.Hashtable$Entry
>   3:  448636   10804840  [Ljava.lang.String;
>   4:   57549    8853936  <constMethodKlass>
>   5:   57549    7377040  <methodKlass>
>   6:    4499    6273896  <constantPoolKlass>
>   7:  109643    5151728  [C
>   8:     923    4626864  [Ljava.util.Hashtable$Entry;
>   9:    7783    3832416  [Ljava.util.HashMap$Entry;
>  10:    4499    3214832  <instanceKlassKlass>
>  11:    3563    3208160  <constantPoolCacheKlass>
>  12:   41943    2612856  [Ljava.lang.Object;
>  13:  107940    2590560  java.lang.String
>  14:   14584    1934088  [S
>  15:   76049    1825176  java.util.ArrayList
>  16:   16139    1592936  [B
>  17:    2303    1041680  <methodDataKlass>
>  18:   40389     969336  javax.ws.rs.core.MediaType
>  19:   33764     810336  com.sun.jersey.core.spi.component.ProviderFactory$SingletonComponentProvider
>  20:   33764     810336  com.sun.jersey.core.spi.component.ComponentInjector
>  21:   25690     616560  com.sun.jersey.core.spi.factory.MessageBodyFactory$MessageBodyWriterPair
>  22:    4877     613240  java.lang.Class
>  23:   11504     552192  java.util.HashMap
>  24:   33764     540224  com.sun.jersey.core.spi.component.ComponentDestructor
>  25:   14680     469760  com.sun.jersey.core.util.KeyComparatorHashMap$Entry
>  26:    7927     436512  [[I
>  27:    2936     234880  [Lcom.sun.jersey.core.util.KeyComparatorHashMap$Entry;
>  28:    6886     220352  java.util.concurrent.ConcurrentHashMap$HashEntry
>  29:    8833     211992  java.util.LinkedList$Node
>  30:   13212     211392  com.sun.jersey.core.impl.provider.xml.ThreadLocalSingletonContextProvider$2
>  31:     644     204520  [I
>  32:    8074     193776  com.sun.jersey.core.spi.factory.InjectableProviderFactory$MetaInjectableProvider
>  33:     361     190608  <objArrayKlassKlass>
>  34:    4178     167120  java.util.LinkedHashMap$Entry
>  35:    8919     142704  java.lang.Object
>  36:    2936     140928  com.sun.jersey.core.util.KeyComparatorHashMap
>  37:    2463     137928  java.util.LinkedHashMap
>  38:    2959     118360  java.lang.ref.Finalizer
>  39:    1345     107600  java.lang.reflect.Method
>  40:    2341      93640  java.util.WeakHashMap$Entry
>  41:    3670      88080  com.sun.jersey.api.client.Client$ContextInjectableProvider
>  42:     734      82208  org.apache.hadoop.hive.ql.hooks.ATSHook$1
>  43:    2470      79040  java.util.concurrent.locks.ReentrantLock$NonfairSync
>  44:     747      77688  java.lang.Thread
>  45:    2936      70464  com.sun.jersey.core.impl.provider.xml.ThreadLocalSingletonContextProvider$1
>  46:    2202      70464  org.codehaus.jackson.jaxrs.MapperConfigurator
>  47:     752      68672  [Ljava.util.WeakHashMap$Entry;
>  48:     218      65552  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
>  49:    1569      64656  [J
>  50:    1613      64520  java.util.TreeMap$Entry
>
> whereas at the start it looked like:
>
> num    #instances     #bytes  class name
> ----------------------------------------------
>   1:   42631    6000544  <constMethodKlass>
>   2:   42631    5467456  <methodKlass>
>   3:    3107    4158440  <constantPoolKlass>
>   4:    3107    2280128  <instanceKlassKlass>
>   5:    2460    2099296  <constantPoolCacheKlass>
>   6:   14455    1592728  [C
>   7:    9631     951424  [B
>   8:    3357     411624  java.lang.Class
>   9:    5254     351232  [S
>  10:   14281     342744  java.lang.String
>  11:    5456     314848  [[I
>  12:     654     309600  <methodDataKlass>
>  13:    5692     182144  java.util.HashMap$Entry
>  14:    4826     154432  java.util.Hashtable$Entry
>  15:    4160     133120  java.util.concurrent.ConcurrentHashMap$HashEntry
>  16:     233     123024  <objArrayKlassKlass>
>  17:    3654      95232  [Ljava.lang.String;
>  18:    1044      88912  [Ljava.lang.Object;
>  19:     357      72512  [Ljava.util.HashMap$Entry;
>  20:    4037      64592  java.lang.Object
>  21:    1608      64320  java.util.TreeMap$Entry
>  22:    1566      62640  com.google.common.collect.MapMakerInternalMap$WeakEntry
>  23:     880      56320  java.net.URL
>  24:     167      51048  [I
>  25:     559      44720  java.lang.reflect.Method
>  26:      86      40080  [Ljava.util.Hashtable$Entry;
>  27:     163      36512  [Ljava.util.concurrent.ConcurrentHashMap$HashEntry;
>  28:     779      31160  java.util.LinkedHashMap$Entry
>  29:     370      29664  [Ljava.util.WeakHashMap$Entry;
>  30:     457      29248  org.apache.hadoop.hive.conf.HiveConf$ConfVars
>  31:     726      29040  java.lang.ref.Finalizer
>  32:     335      26800  java.util.jar.JarFile$JarFileEntry
>  33:    1566      25056  com.google.common.collect.MapMakerInternalMap$StrongValueReference
>  34:     351      22464  java.util.jar.JarFile
>  35:     370      20720  java.util.WeakHashMap
>  36:     438      17520  java.lang.ref.SoftReference
>  37:     360      17280  sun.nio.cs.UTF_8$Encoder
>  38:     358      17184  sun.misc.URLClassPath$JarLoader
>  39:     337      16176  java.util.zip.Inflater
>  40:     223      16056  java.lang.reflect.Constructor
>  41:     328      15744  java.util.HashMap
>  42:     537      12888  java.util.LinkedList$Node
>  43:     539      12520  [Ljava.lang.Class;
>  44:     384      12288  java.lang.ref.ReferenceQueue
>  45:     357      11424  java.util.zip.ZipCoder
>  46:     284       9088  java.util.LinkedList
>  47:     357       8568  java.util.ArrayDeque
>  48:     337       8088  java.util.zip.ZStreamRef
>  49:     306       7344  java.lang.Long
>  50:      34       7176  [Z
>
> On 8 July 2014 at 08:40:20, Navis류승우 (navis....@nexr.com) wrote:
>
> Could you try "jmap -histo:live <pid>" and check which Hive objects seem too numerous?
>
> Thanks,
> Navis
>
> 2014-07-07 22:22 GMT+09:00 jonas.partner <jonas.part...@opencredo.com>:
>
>> Hi Benjamin,
>>
>> Unfortunately this was a really critical issue for us and I didn't think we would find a fix in time, so we switched to generating a Hive script programmatically and then running it via an Oozie action that uses the Hive CLI. This gives a stable solution, although it is a lot less convenient than JDBC for our use case.
>>
>> I hope to find some more time to look at this later in the week, since JDBC would simplify the solution. I would be very interested to hear if you make any progress.
>>
>> Regards
>>
>> Jonas
>>
>> On 7 July 2014 at 14:14:46, Benjamin Bowman (bbowman...@gmail.com) wrote:
>>
>> I believe I am having the same issue. Hive 0.13 and Hadoop 2.4. We had to increase the Hive heap to 4 GB, which allows Hive to function for about 2-3 days. After that point it has consumed the entire heap and becomes unresponsive and/or throws OOM exceptions. We are using Beeline and HiveServer2 and connect via JDBC to the database tens of thousands of times a day.
>>
>> I have been working with a developer at Hortonworks to find a solution, but we have not come up with anything yet. Have you made any progress on this issue?
>>
>> Thanks,
>> Benjamin
>>
>> On Thu, Jul 3, 2014 at 4:17 PM, jonas.partner <jonas.part...@opencredo.com> wrote:
>>
>>> Hi Edward,
>>>
>>> Thanks for the response. Sorry, I posted the wrong version. I also added close() on the two result sets in the code taken from the wiki, as below, but still see the same problem.
>>>
>>> Will try to run it through your kit at the weekend. For the moment I have switched to running the statements as a script through the Hive client (not Beeline), which seems stable even with hundreds of repetitions.
>>>
>>> Regards
>>>
>>> Jonas
>>>
>>>     public static void run() throws SQLException {
>>>         try {
>>>             Class.forName(driverName);
>>>         } catch (ClassNotFoundException e) {
>>>             // TODO Auto-generated catch block
>>>             e.printStackTrace();
>>>             System.exit(1);
>>>         }
>>>         // replace "hive" here with the name of the user the queries should run as
>>>         Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
>>>         Statement stmt = con.createStatement();
>>>         String tableName = "testHiveDriverTable";
>>>         stmt.execute("drop table if exists " + tableName);
>>>         stmt.execute("create external table " + tableName + " (key int, value string)");
>>>         // show tables
>>>         String sql = "show tables '" + tableName + "'";
>>>         System.out.println("Running: " + sql);
>>>         ResultSet res = stmt.executeQuery(sql);
>>>         if (res.next()) {
>>>             System.out.println(res.getString(1));
>>>         }
>>>         res.close();
>>>         // describe table
>>>         sql = "describe " + tableName;
>>>         System.out.println("Running: " + sql);
>>>         res = stmt.executeQuery(sql);
>>>         while (res.next()) {
>>>             System.out.println(res.getString(1) + "\t" + res.getString(2));
>>>         }
>>>         res.close();
>>>         stmt.close();
>>>         con.close();
>>>     }
>>>
>>> On 3 July 2014 at 21:05:25, Edward Capriolo (edlinuxg...@gmail.com) wrote:
>>>
>>> Not saying there is not a leak elsewhere, but Statement and ResultSet objects both have .close().
>>>
>>> Java 7 now allows you to auto-close (see the sketch at the end of this thread):
>>>
>>>     try (Connection conn = ...; Statement st = conn.createStatement()) {
>>>         // something
>>>     }
>>>
>>> On Thu, Jul 3, 2014 at 6:35 AM, jonas.partner <jonas.part...@opencredo.com> wrote:
>>>
>>>> We have been struggling to get a reliable system working where we interact with Hive over JDBC a lot. The pattern we see is that everything starts OK, but the memory used by the Hive server process grows over time, and after some hundreds of operations we start to see exceptions.
>>>>
>>>> To ensure there was nothing stupid in our code causing this, I took the example code from the wiki page for HiveServer2 clients and put it in a loop. For us, after about 80 runs we would see exceptions as below.
>>>>
>>>> 2014-04-21 07:31:02,251 ERROR [pool-5-thread-5]: server.TThreadPoolServer (TThreadPoolServer.java:run(215)) - Error occurred during processing of message.
>>>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
>>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>>>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>     at java.lang.Thread.run(Thread.java:744)
>>>> Caused by: org.apache.thrift.transport.TTransportException
>>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>>>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>>     at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
>>>>     at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>>>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>>>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>>>     ... 4 more
>>>>
>>>> This is also sometimes accompanied by out-of-memory exceptions.
>>>>
>>>> The code on the wiki did not close statements, and adding that in changes the behaviour: instead of exceptions, things just lock up after a while with high CPU usage.
>>>>
>>>> This looks similar to HIVE-5296 <https://issues.apache.org/jira/browse/HIVE-5296>, but that was fixed in 0.12, so I assume it should not be an issue in 0.13. The issues fixed in 0.13.1 don't seem to relate to this either. The only way to get Hive back up and running is to restart it.
>>>>
>>>> Before raising a JIRA I wanted to make sure I wasn't missing something, so any suggestions would be greatly appreciated.
>>>>
>>>> Full code as below.
>>>>
>>>>     import java.sql.*;
>>>>
>>>>     public class HiveOutOfMem {
>>>>
>>>>         private static String driverName = "org.apache.hive.jdbc.HiveDriver";
>>>>
>>>>         public static void main(String[] args) throws SQLException {
>>>>             for (int i = 0; i < 100000; i++) {
>>>>                 System.out.println("Run number " + i);
>>>>                 run();
>>>>             }
>>>>         }
>>>>
>>>>         /**
>>>>          * @throws SQLException
>>>>          */
>>>>         public static void run() throws SQLException {
>>>>             try {
>>>>                 Class.forName(driverName);
>>>>             } catch (ClassNotFoundException e) {
>>>>                 // TODO Auto-generated catch block
>>>>                 e.printStackTrace();
>>>>                 System.exit(1);
>>>>             }
>>>>             // replace "hive" here with the name of the user the queries should run as
>>>>             Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
>>>>             Statement stmt = con.createStatement();
>>>>             String tableName = "testHiveDriverTable";
>>>>             stmt.execute("drop table if exists " + tableName);
>>>>             stmt.execute("create external table " + tableName + " (key int, value string)");
>>>>             // show tables
>>>>             String sql = "show tables '" + tableName + "'";
>>>>             System.out.println("Running: " + sql);
>>>>             ResultSet res = stmt.executeQuery(sql);
>>>>             if (res.next()) {
>>>>                 System.out.println(res.getString(1));
>>>>             }
>>>>             // describe table
>>>>             sql = "describe " + tableName;
>>>>             System.out.println("Running: " + sql);
>>>>             res = stmt.executeQuery(sql);
>>>>             while (res.next()) {
>>>>                 System.out.println(res.getString(1) + "\t" + res.getString(2));
>>>>             }
>>>>             // note: neither ResultSet is closed here, and the Statement close is commented out
>>>>             //stmt.close();
>>>>             con.close();
>>>>         }
>>>>     }
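For completeness, here is a minimal sketch of the same test program reworked into the Java 7 try-with-resources form Edward suggests, so the connection, statement, and both result sets are closed even when a query throws. The JDBC URL, user, and table name are carried over from the code above; the class name is made up for illustration. This demonstrates the client-side resource-handling pattern only and is not a verified fix for the leak discussed in this thread.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Hypothetical class name; URL, user, and table name taken from the thread.
    public class HiveTryWithResources {

        private static String driverName = "org.apache.hive.jdbc.HiveDriver";

        public static void main(String[] args) throws Exception {
            Class.forName(driverName); // load the Hive JDBC driver once, up front
            for (int i = 0; i < 100000; i++) {
                System.out.println("Run number " + i);
                run();
            }
        }

        public static void run() throws SQLException {
            String tableName = "testHiveDriverTable";
            // Resources declared in the try header are closed automatically,
            // in reverse order, whether the block exits normally or by exception.
            try (Connection con = DriverManager.getConnection(
                         "jdbc:hive2://localhost:10000/default", "hive", "");
                 Statement stmt = con.createStatement()) {
                stmt.execute("drop table if exists " + tableName);
                stmt.execute("create external table " + tableName + " (key int, value string)");
                // show tables
                try (ResultSet res = stmt.executeQuery("show tables '" + tableName + "'")) {
                    if (res.next()) {
                        System.out.println(res.getString(1));
                    }
                }
                // describe table
                try (ResultSet res = stmt.executeQuery("describe " + tableName)) {
                    while (res.next()) {
                        System.out.println(res.getString(1) + "\t" + res.getString(2));
                    }
                }
            }
        }
    }

Even with every resource closed on the client like this, the growth visible in the histograms above (MediaType and Jersey provider instances) happens inside the HiveServer2 process, which is why the patch Benjamin links at the top of the thread was needed in addition to correct client code.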