[ https://issues.apache.org/jira/browse/HIVE-25191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355944#comment-17355944 ]
Thejas Nair commented on HIVE-25191:
------------------------------------

[~mattmccline] - Thanks for working on this.
I would recommend creating two specific Jiras for the two issues you mention here, with titles that are more self-explanatory.

For query execution, we already have a long-poll mechanism, so you don't need a long persistent connection (the hive.server2.long.polling.timeout config is relevant to that). I think [~vgumashta] did some work on async query compilation as well, but that might not be complete.

As for cleanup when the client goes away, there is already support for hive.server2.idle.session.timeout and hive.server2.idle.operation.timeout. Does that not address the use case?

> Modernize Hive Thrift CLI Service Protocol
> ------------------------------------------
>
>                 Key: HIVE-25191
>                 URL: https://issues.apache.org/jira/browse/HIVE-25191
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Major
>
> Unnecessary errors are occurring with the advent of proxy use such as
> Gateways between the Hive client and Hive Server 2. Query failures can be due
> to arbitrary proxy timeouts. This proposal avoids the timeouts by changing
> the protocol to do regular polling. Currently, the Hive client uses one
> request for the query compile request. Long query compile times make those
> requests vulnerable to the arbitrary proxy timeouts.
> Another issue is Hive Server 2 sometimes does not notice the client has
> failed or has lost interest in a potentially long running query. This causes
> Hive locks and Big Data query resources to be held unnecessarily. The
> assumption is the client issues a cancel query request when it gets an error.
> This assumption does not always hold. If the proxy returned an error itself,
> that proxy may reject the subsequent cancel request, too. And, if the client
> is killed or the network is down, the client cannot complete a cancel
> request.
> The proposed solution here is for Hive Server 2 to watch that the
> client is sending regular polling requests for status. If a client ceases
> those requests, then Hive Server 2 will cancel the query.
> Hive owns the JDBC path (i.e. HiveDriver). The ODBC path may be more
> challenging because vendors provide ODBC drivers and Hive does not own the
> ODBC protocol.
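
For illustration only, the server-side behavior described in the proposal (cancel a query once the client stops polling for status) could be sketched roughly as below. This is not Hive code: PollWatchdog, poll_timeout_s, and the cancel callback are hypothetical names invented for the sketch, and the existing hive.server2.idle.* timeouts are separate settings that are not modeled here.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PollWatchdog:
    """Tracks the last status poll per operation and cancels stale ones.

    Hypothetical sketch: none of these names exist in Hive.
    """
    poll_timeout_s: float                       # assumed setting, not a real Hive config
    cancel: Callable[[str], None]               # assumed callback that cancels a query by id
    clock: Callable[[], float] = time.monotonic  # injectable so the logic can be tested
    _last_poll: Dict[str, float] = field(default_factory=dict)

    def register(self, op_id: str) -> None:
        """Start watching an operation when the query is submitted."""
        self._last_poll[op_id] = self.clock()

    def on_status_poll(self, op_id: str) -> None:
        """Record that the client asked for status; ignore unknown operations."""
        if op_id in self._last_poll:
            self._last_poll[op_id] = self.clock()

    def finish(self, op_id: str) -> None:
        """Stop watching an operation that completed or was closed normally."""
        self._last_poll.pop(op_id, None)

    def sweep(self) -> List[str]:
        """Cancel every operation whose last poll is older than the timeout."""
        now = self.clock()
        stale = [op for op, seen in self._last_poll.items()
                 if now - seen > self.poll_timeout_s]
        for op in stale:
            del self._last_poll[op]
            self.cancel(op)
        return stale
```

In this sketch, a background thread in the server would call sweep() periodically. A client killed mid-query, or cut off by a proxy, simply stops calling on_status_poll(), so the operation is cancelled without any explicit cancel request having to get through.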