Alexandre Linte created HIVE-14631:
--------------------------------------

             Summary: HiveServer2 regularly fails to connect to metastore
                 Key: HIVE-14631
                 URL: https://issues.apache.org/jira/browse/HIVE-14631
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 2.1.0, 2.0.0, 1.2.1
         Environment: Hive 2.1.0, Hue 3.10.0, Hadoop 2.7.2, Tez 0.8.3
            Reporter: Alexandre Linte

I have a cluster secured with Kerberos, and Hive is configured to work with Tez by default. Everything works well through hive-cli and beeline; however, I'm seeing strange behavior through Hue. There can be a lot of client connections (up to 600), and after a day client connections start to fail, although not every connection attempt fails. When one fails, I see the following logs on HiveServer2:

{noformat}
Aug 3 09:28:04 hiveserver2.bigdata.fr Executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112): INSERT INTO TABLE shfs3453.camille_test VALUES ('coucou')
Aug 3 09:28:04 hiveserver2.bigdata.fr Query ID = hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112
Aug 3 09:28:04 hiveserver2.bigdata.fr Total jobs = 1
Aug 3 09:28:04 hiveserver2.bigdata.fr Launching Job 1 out of 1
Aug 3 09:28:04 hiveserver2.bigdata.fr Starting task [Stage-1:MAPRED] in parallel
Aug 3 09:28:04 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
Aug 3 09:28:04 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug 3 09:28:04 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
Aug 3 09:28:05 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
Aug 3 09:28:05 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug 3 09:28:05 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
Aug 3 09:28:06 hiveserver2.bigdata.fr Trying to connect to metastore with URI thrift://metastore01.bigdata.fr:9083
Aug 3 09:28:06 hiveserver2.bigdata.fr Failed to connect to the MetaStore Server...
Aug 3 09:28:06 hiveserver2.bigdata.fr Waiting 1 seconds before next connection attempt.
Aug 3 09:28:08 hiveserver2.bigdata.fr FAILED: Execution Error, return code -1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Aug 3 09:28:08 hiveserver2.bigdata.fr Completed executing command(queryId=hiveserver2_20160803092803_a216edf1-bb51-43a7-81a6-f40f1574b112); Time taken: 4.002 seconds
{noformat}

At the same time, I see the following logs on the Metastore:

{noformat}
Aug 3 09:28:03 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
Aug 3 09:28:03 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
Aug 3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
Aug 3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
Aug 3 09:28:04 metastore01.bigdata.fr 180: get_table : db=shfs3453 tbl=camille_test
Aug 3 09:28:04 metastore01.bigdata.fr ugi=shfs3453#011ip=10.77.64.228#011cmd=get_table : db=shfs3453 tbl=camille_test#011
Aug 3 09:28:04 metastore01.bigdata.fr SASL negotiation failure
Aug 3 09:28:04 metastore01.bigdata.fr Error occurred during processing of message.
Aug 3 09:28:05 metastore01.bigdata.fr SASL negotiation failure
Aug 3 09:28:05 metastore01.bigdata.fr Error occurred during processing of message.
Aug 3 09:28:06 metastore01.bigdata.fr SASL negotiation failure
Aug 3 09:28:06 metastore01.bigdata.fr Error occurred during processing of message.
{noformat}

Note: I also created a JIRA for Hue:

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)