Could you try "jmap -histo:live <pid>" against the HiveServer2 process and check which Hive objects seem too numerous?
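(For reference, a sketch of that diagnostic step, assuming the JDK's jps and jmap tools are on the PATH; the pid 12345 and the grep filters below are illustrative, not from the thread.)

    # find the HiveServer2 pid
    jps -l | grep -i hive

    # histogram of live objects in that JVM, filtered to Hive classes
    jmap -histo:live 12345 | grep -i hive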
Thanks,
Navis

2014-07-07 22:22 GMT+09:00 jonas.partner <jonas.part...@opencredo.com>:

Hi Benjamin,

Unfortunately this was a really critical issue for us and I didn't think we would find a fix in time, so we switched to generating Hive scripts programmatically and running them via an Oozie action which uses the Hive CLI. That gives us a stable solution, although it is a lot less convenient than JDBC for our use case.

I hope to find some more time to look at this later in the week, since JDBC would simplify the solution. I would be very interested to hear if you make any progress.

Regards

Jonas
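(For reference: a minimal sketch of the kind of Oozie workflow Jonas describes, with a Hive action running a generated script. The workflow name, script name, and schema versions are illustrative assumptions, not details from the thread.)

    <workflow-app xmlns="uri:oozie:workflow:0.4" name="generated-hive-wf">
        <start to="run-hive-script"/>
        <action name="run-hive-script">
            <hive xmlns="uri:oozie:hive-action:0.2">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <!-- the programmatically generated Hive script -->
                <script>generated-statements.q</script>
            </hive>
            <ok to="end"/>
            <error to="fail"/>
        </action>
        <kill name="fail">
            <message>Hive action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
        </kill>
        <end name="end"/>
    </workflow-app>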
On 7 July 2014 at 14:14:46, Benjamin Bowman (bbowman...@gmail.com) wrote:

I believe I am having the same issue. Hive 0.13 and Hadoop 2.4. We had to increase the Hive heap to 4 GB, which allows Hive to function for about 2-3 days. After that point it has consumed the entire heap and becomes unresponsive and/or throws OOM exceptions. We are using Beeline and HiveServer2 and connect via JDBC to the database tens of thousands of times a day.

I have been working with a developer at Hortonworks to find a solution, but we have not come up with anything yet. Have you made any progress on this issue?

Thanks,
Benjamin

On Thu, Jul 3, 2014 at 4:17 PM, jonas.partner <jonas.part...@opencredo.com> wrote:

Hi Edward,

Thanks for the response. Sorry, I posted the wrong version. I also added close() on the two result sets to the code taken from the wiki, as below, but still see the same problem.

I will try to run it through your kit at the weekend. For the moment I have switched to running the statements as a script through the Hive CLI (not Beeline), which seems stable even with hundreds of repetitions.

Regards

Jonas

    public static void run() throws SQLException {
        try {
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
            System.exit(1);
        }
        // replace "hive" here with the name of the user the queries should run as
        Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
        Statement stmt = con.createStatement();
        String tableName = "testHiveDriverTable";
        stmt.execute("drop table if exists " + tableName);
        stmt.execute("create external table " + tableName + " (key int, value string)");
        // show tables
        String sql = "show tables '" + tableName + "'";
        System.out.println("Running: " + sql);
        ResultSet res = stmt.executeQuery(sql);
        if (res.next()) {
            System.out.println(res.getString(1));
        }
        res.close();
        // describe table
        sql = "describe " + tableName;
        System.out.println("Running: " + sql);
        res = stmt.executeQuery(sql);
        while (res.next()) {
            System.out.println(res.getString(1) + "\t" + res.getString(2));
        }
        res.close();
        stmt.close();
        con.close();
    }

On 3 July 2014 at 21:05:25, Edward Capriolo (edlinuxg...@gmail.com) wrote:

Not saying there is not a leak elsewhere, but Statement and ResultSet objects both have .close().

Java 7 now allows you to auto-close them with try-with-resources:

    try (Connection conn = DriverManager.getConnection(url, user, password);
         Statement st = conn.createStatement()) {
        // conn and st are closed automatically when the block exits
    }

On Thu, Jul 3, 2014 at 6:35 AM, jonas.partner <jonas.part...@opencredo.com> wrote:

We have been struggling to get a reliable system working where we interact with Hive over JDBC a lot. The pattern we see is that everything starts OK, but the memory used by the Hive server process grows over time, and after some hundreds of operations we start to see exceptions.

To ensure there was nothing stupid in our code causing this, I took the example code from the wiki page for HiveServer2 clients and put it in a loop. For us, after about 80 runs we would see exceptions as below.

2014-04-21 07:31:02,251 ERROR [pool-5-thread-5]: server.TThreadPoolServer (TThreadPoolServer.java:run(215)) - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
        at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
        ... 4 more

This is also sometimes accompanied by out of memory exceptions.

The code on the wiki did not close statements. Adding that in changes the behaviour: instead of exceptions, things just lock up after a while and there is high CPU usage.

This looks similar to HIVE-5296 <https://issues.apache.org/jira/browse/HIVE-5296>, but that was fixed in 0.12, so I assume it should not be an issue in 0.13. The issues fixed in 0.13.1 don't seem to relate to this either. The only way to get Hive back up and running is to restart it.

Before raising a JIRA I wanted to make sure I wasn't missing something, so any suggestions would be greatly appreciated.

Full code as below.

    import java.sql.*;

    public class HiveOutOfMem {

        private static String driverName = "org.apache.hive.jdbc.HiveDriver";

        public static void main(String[] args) throws SQLException {
            for (int i = 0; i < 100000; i++) {
                System.out.println("Run number " + i);
                run();
            }
        }

        public static void run() throws SQLException {
            try {
                Class.forName(driverName);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
            // replace "hive" here with the name of the user the queries should run as
            Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
            Statement stmt = con.createStatement();
            String tableName = "testHiveDriverTable";
            stmt.execute("drop table if exists " + tableName);
            stmt.execute("create external table " + tableName + " (key int, value string)");
            // show tables
            String sql = "show tables '" + tableName + "'";
            System.out.println("Running: " + sql);
            ResultSet res = stmt.executeQuery(sql);
            if (res.next()) {
                System.out.println(res.getString(1));
            }
            // describe table
            sql = "describe " + tableName;
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            while (res.next()) {
                System.out.println(res.getString(1) + "\t" + res.getString(2));
            }
            //stmt.close();
            con.close();
        }
    }
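(For reference, a minimal sketch of the wiki loop rewritten along the lines Edward suggests, with every Connection, Statement, and ResultSet closed via Java 7 try-with-resources. The URL, user, and table name are taken from the thread; this illustrates the cleanup pattern only and is not a confirmed fix for the server-side heap growth.)

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class HiveAutoClose {

        public static void main(String[] args) throws SQLException {
            // JDBC 4 drivers register themselves, but the explicit load matches the wiki example
            try {
                Class.forName("org.apache.hive.jdbc.HiveDriver");
            } catch (ClassNotFoundException e) {
                throw new SQLException("Hive JDBC driver not on classpath", e);
            }
            for (int i = 0; i < 100000; i++) {
                System.out.println("Run number " + i);
                run();
            }
        }

        public static void run() throws SQLException {
            String tableName = "testHiveDriverTable";
            // try-with-resources closes con and stmt on every path, including exceptions
            try (Connection con = DriverManager.getConnection(
                         "jdbc:hive2://localhost:10000/default", "hive", "");
                 Statement stmt = con.createStatement()) {
                stmt.execute("drop table if exists " + tableName);
                stmt.execute("create external table " + tableName + " (key int, value string)");
                // show tables
                String sql = "show tables '" + tableName + "'";
                System.out.println("Running: " + sql);
                try (ResultSet res = stmt.executeQuery(sql)) {
                    if (res.next()) {
                        System.out.println(res.getString(1));
                    }
                }
                // describe table
                sql = "describe " + tableName;
                System.out.println("Running: " + sql);
                try (ResultSet res = stmt.executeQuery(sql)) {
                    while (res.next()) {
                        System.out.println(res.getString(1) + "\t" + res.getString(2));
                    }
                }
            }
        }
    }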