Thanks Carl, I'll check that. But surely I can't be the only one running Hive queries that last more than 10 minutes over a Thrift client! Hive is intended to work with large data sets, so long-running queries should be expected. I wonder why I couldn't find any discussion of this on the mailing list.
-ayush

On Fri, Feb 25, 2011 at 12:54 PM, Carl Steinbach <c...@cloudera.com> wrote:

> I filed a JIRA ticket to track the task of making the Thrift socket timeout
> configurable:
>
> https://issues.apache.org/jira/browse/HIVE-2006
>
> On Thu, Feb 24, 2011 at 11:21 PM, Carl Steinbach <c...@cloudera.com> wrote:
>
>> Hi Ayush,
>>
>> I suspect you're running into Thrift's default socket timeout setting. I
>> recommend checking out a copy of the Hive source code, and modifying the
>> Thrift setup code in HiveServer.java to explicitly set the socket timeout on
>> the TServerSocket, e.g. in HiveServer.main() change
>>
>> TServerTransport serverTransport = new TServerSocket(port);
>>
>> to
>>
>> TServerTransport serverTransport = new TServerSocket(port, timeout);
>>
>> The Thrift javadoc doesn't specify whether timeout is in seconds or
>> milliseconds, so you'll probably have to play around with this value.
>>
>> Hope this helps.
>>
>> Carl
>>
>> On Thu, Feb 24, 2011 at 10:52 PM, Ayush Gupta <ay...@glugbot.com> wrote:
>>
>>> On Fri, Feb 25, 2011 at 12:17 PM, Viral Bajaria <viral.baja...@gmail.com> wrote:
>>>
>>>> What do the logs of the thrift server say? If they do not give any
>>>> relevant information, I would enable DEBUG level logging on the console.
>>>
>>> The hiveserver is pretty quiet; the connection appears to be terminated
>>> silently. I'll up the logging to DEBUG, thanks for that suggestion.
>>>
>>>> Also a point to remember is the single-threaded nature of the hive
>>>> thrift server (at least up to v0.5)
>>>
>>> Yeah, there is only this one client connected in this scenario.
>>>
>>>> But looking at the logs is what will be the first thing that I would
>>>> do.
>>>>
>>>> The query (map/reduce job) will continue to run even if you shut down the
>>>> server since a shutdown does not kill the job submitted to the JobTracker.
>>>
>>> sure
>>>
>>>> On Thu, Feb 24, 2011 at 9:36 PM, Ayush Gupta <ay...@glugbot.com> wrote:
>>>>
>>>>> Probing this further reveals that the connection is reset by the server
>>>>> in exactly 10 minutes every time.
>>>>>
>>>>> I'm running Hive 0.6. I do not see anything relevant at
>>>>> http://wiki.apache.org/hadoop/Hive/AdminManual/Configuration but is
>>>>> there some configuration property which controls this?
>>>>>
>>>>> -ayush
>>>>>
>>>>> On Fri, Feb 25, 2011 at 8:23 AM, Ayush Gupta <ay...@glugbot.com> wrote:
>>>>>
>>>>>> Hi! I'm having some trouble running queries from a Java client against
>>>>>> a remote Thrift Hive server. It's all set up and quicker queries do run
>>>>>> through fine.
>>>>>>
>>>>>> But queries which run longer than about 10 minutes disconnect the
>>>>>> client with a "TTransportException: Connection reset" exception. The
>>>>>> query continues to run on the Hive server but since the client is
>>>>>> disconnected the results are "lost". The complete stack trace is below.
>>>>>> Does this sound familiar to anyone?
>>>>>>
>>>>>> org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
>>>>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
>>>>>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>>>>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
>>>>>>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
>>>>>>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
>>>>>>     at org.apache.hadoop.hive.service.ThriftHive$Client.recv_execute(ThriftHive.java:72)
>>>>>>     at org.apache.hadoop.hive.service.ThriftHive$Client.execute(ThriftHive.java:57)
>>>>>>     at com.wordnik.analytics.data.ReportsRunner$.refreshReport(ReportsRunner.scala:105)
>>>>>>     at com.wordnik.analytics.data.ReportsRunner$.refreshDailyReport(ReportsRunner.scala:34)
>>>>>>     at com.wordnik.analytics.data.ReportsRunner.refreshDailyReport(ReportsRunner.scala)
>>>>>>     at com.wordnik.analytics.util.Temp.main(Temp.java:11)
>>>>>> Caused by: java.net.SocketException: Connection reset
>>>>>>     at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>>>>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>>>>>>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>>>>>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>>>>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125)
>>>>>>     ... 10 more
>>>>>>
>>>>>> -ayush