Richard Williams created HIVE-10410:
---------------------------------------

             Summary: Apparent race condition in HiveServer2 causing 
intermittent query failures
                 Key: HIVE-10410
                 URL: https://issues.apache.org/jira/browse/HIVE-10410
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 0.13.1
         Environment: CDH 5.3.3
CentOS 6.5
            Reporter: Richard Williams


On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC 
occasionally trigger odd Thrift exceptions with messages such as "Read a 
negative frame size (-2147418110)!" or "out of sequence response" in 
HiveServer2's connections to the metastore. For certain metastore calls (for 
example, showDatabases), these Thrift exceptions are converted to 
MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient 
from retrying these calls and thus causes the failure to bubble out to the JDBC 
client.
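
To illustrate why the conversion matters, here is a simplified model of the 
retry behavior described above. It is a sketch, not Hive's actual code: 
RetryModel, TransportError, WrappedError, callWithRetry, converting and 
flakyOnce are all hypothetical stand-ins, and it assumes that the retry layer 
retries only transport-level exception types.

```java
import java.util.concurrent.Callable;

class RetryModel {
    /** Hypothetical stand-in for a Thrift transport-level error. */
    static class TransportError extends Exception {
        TransportError(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for the exception type the error is converted to. */
    static class WrappedError extends Exception {
        WrappedError(String msg) { super(msg); }
    }

    /** Retries only TransportError; any other exception propagates immediately. */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransportError last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (TransportError e) {
                last = e; // retryable, so try again
            }
        }
        throw last; // every attempt failed with a retryable error
    }

    /** Mimics a client method that converts transport errors before the retry layer sees them. */
    static <T> Callable<T> converting(Callable<T> inner) {
        return () -> {
            try {
                return inner.call();
            } catch (TransportError e) {
                throw new WrappedError(e.getMessage());
            }
        };
    }

    /** A call that fails on its first invocation only, then succeeds. */
    static Callable<String> flakyOnce() {
        int[] calls = {0};
        return () -> {
            if (calls[0]++ == 0) {
                throw new TransportError("out of sequence response");
            }
            return "ok";
        };
    }
}
```

In this model, callWithRetry(flakyOnce(), 3) recovers on the second attempt, 
while callWithRetry(converting(flakyOnce()), 3) fails on the first attempt with 
WrappedError, because the retry loop never sees an exception type it treats as 
retryable.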

Note that as far as we can tell, this issue appears to affect only queries that 
are submitted with the runAsync flag on TExecuteStatementReq set to true 
(which, in practice, seems to mean all JDBC queries), and it appears to 
manifest only when HiveServer2 is using the new HTTP transport mechanism. When 
both of these conditions hold, we are able to reproduce the issue fairly 
reliably by spawning about 100 simple, concurrent Hive queries (we have been 
using "show databases"), two or three of which typically fail. However, when 
either of these conditions does not hold, we are no longer able to reproduce 
the issue.
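
A reproduction along these lines could be sketched as follows. This is a 
minimal harness under stated assumptions, not the exact script we used: the 
class name ConcurrentQueryRepro and the helper countFailures are hypothetical, 
the JDBC URL is taken from the command line because its HTTP-transport and 
Kerberos parameters depend on the Hive version and cluster, and the Hive JDBC 
driver is assumed to be on the classpath (older drivers may need an explicit 
Class.forName("org.apache.hive.jdbc.HiveDriver") first).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentQueryRepro {

    /** Runs n copies of task at roughly the same time and returns how many threw. */
    static int countFailures(int n, Callable<Void> task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Void>> futures = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            futures.add(pool.submit(() -> {
                start.await(); // hold every worker until all are queued
                return task.call();
            }));
        }
        start.countDown(); // release all workers together
        int failures = 0;
        for (Future<Void> f : futures) {
            try {
                f.get();
            } catch (ExecutionException e) {
                failures++;
                System.err.println("query failed: " + e.getCause());
            }
        }
        pool.shutdown();
        return failures;
    }

    public static void main(String[] args) throws Exception {
        String url = args[0]; // JDBC URL for HiveServer2 in HTTP mode (site-specific)
        int failures = countFailures(100, () -> {
            try (Connection c = DriverManager.getConnection(url);
                 Statement s = c.createStatement();
                 ResultSet rs = s.executeQuery("show databases")) {
                while (rs.next()) {
                    // drain the result set; only success or failure matters here
                }
            }
            return null;
        });
        System.out.println(failures + " of 100 queries failed");
    }
}
```

The latch is there so that all connections are opened at nearly the same 
moment, which should make any contention inside HiveServer2 more likely to 
show up than it would with a staggered start.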

Some example stack traces from the HiveServer2 logs:




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
