[ https://issues.apache.org/jira/browse/HIVE-9423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506729#comment-15506729 ]
Peter Vary commented on HIVE-9423:
----------------------------------

I have investigated the issue, and here is what I found:
- There was an issue in the Thrift code: if there were not enough executor threads, the TThreadPoolExecutor got stuck in an infinite loop, see THRIFT-2046. This issue is resolved in Thrift 0.9.2.
- Hive 1.x and 2.x use Thrift 0.9.3.

I have tested the behavior on Hive 2.2.0-SNAPSHOT with the following configuration:
- Add the following lines to hive-site.xml:
{code}
<property>
  <name>hive.server2.thrift.max.worker.threads</name>
  <value>1</value>
</property>
<property>
  <name>hive.server2.thrift.min.worker.threads</name>
  <value>1</value>
</property>
{code}
- Start a metastore and an HS2 instance
- Start 2 BeeLine clients and connect them to the HS2

The 1st BeeLine connected as expected; the 2nd BeeLine, after the configured timeout period (default 20s), printed the following:
{code}
Connecting to jdbc:hive2://localhost:10000
16/09/20 16:23:57 [main]: WARN jdbc.HiveConnection: Failed to connect to localhost:10000
HS2 may be unavailable, check server status
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000: null (state=08S01,code=0)
Beeline version 2.2.0-SNAPSHOT by Apache Hive
beeline>
{code}

This behavior is much better than the original problem (no HS2 restart is needed, and closing unused connections helps), but it is not a perfect solution, since there is no way to tell the difference between a non-running HS2 and an HS2 with an exhausted executor pool.

> HiveServer2: Implement some admission control mechanism for graceful
> degradation when resources are exhausted
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9423
>                 URL: https://issues.apache.org/jira/browse/HIVE-9423
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 0.12.0, 0.13.0, 0.14.0, 0.15.0
>            Reporter: Vaibhav Gumashta
>
> An example of where it is needed: it has been reported that when # of client
> connections is greater than {{hive.server2.thrift.max.worker.threads}},
> HiveServer2 stops accepting new connections and ends up having to be
> restarted. This should be handled more gracefully by the server and the JDBC
> driver, so that the end user becomes aware of the problem and can take
> appropriate steps (either close existing connections, bump up the config
> value, or use multiple server instances with dynamic service discovery
> enabled). Similarly, we should also review the behaviour of the background thread
> pool to have a well-defined behavior when the pool gets exhausted.
> Ideally, implementing some form of general admission control would be a better
> solution, so that we do not accept new work unless sufficient resources are
> available and display graceful degradation under overload.
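For completeness, here is a minimal JDBC sketch of the repro described in the comment above. It assumes an HS2 instance running locally with the min/max worker thread configuration shown earlier and hive-jdbc on the classpath; the class name, URL, and timeout value are illustrative, and whether the driver honors the JDBC login timeout depends on the driver version.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class HS2PoolExhaustionRepro {
  public static void main(String[] args) throws Exception {
    // Illustrative URL - adjust host/port to the HS2 instance under test.
    String url = "jdbc:hive2://localhost:10000";
    // Explicit driver load; recent hive-jdbc versions auto-register the driver.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // Optional: cap how long the connect may block; the driver/server may
    // also apply its own timeout (the ~20s seen in the BeeLine output above).
    DriverManager.setLoginTimeout(30);
    try (Connection first = DriverManager.getConnection(url);
         Connection second = DriverManager.getConnection(url)) {
      System.out.println("Both connections opened - worker pool not exhausted");
    } catch (SQLException e) {
      // With min/max worker threads set to 1, the second connection fails
      // with the same SQLState (08S01) a client would see if HS2 were down,
      // so the exhausted pool cannot be told apart from a dead server.
      System.out.println("SQLState: " + e.getSQLState());
      System.out.println("Message : " + e.getMessage());
    }
  }
}
{code}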