[ https://issues.apache.org/jira/browse/HIVE-16321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15975813#comment-15975813 ]
Hive QA commented on HIVE-16321:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12864061/HIVE-16321.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 10581 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge2] (batchId=167)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/4770/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/4770/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-4770/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12864061 - PreCommit-HIVE-Build

> Possible deadlock in metastore with Acid enabled
> ------------------------------------------------
>
>                 Key: HIVE-16321
>                 URL: https://issues.apache.org/jira/browse/HIVE-16321
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>         Attachments: HIVE-16321.01.patch, HIVE-16321.02.patch
>
>
> TxnStore.MutexAPI is a mechanism by which different Metastore instances can
> coordinate their operations. It uses a JDBCConnection to achieve this.
> In some cases this may lead to deadlock. TxnHandler uses a connection pool
> of fixed size. Suppose you have X simultaneous calls to TxnHandler.lock(),
> where X is >= size of the pool.
> This takes all connections from the pool, so when
> {noformat}
> handle = getMutexAPI().acquireLock(MUTEX_KEY.CheckLock.name());
> {noformat}
> is executed in _TxnHandler.checkLock(Connection dbConn, long extLockId)_ the
> pool is empty and the system is deadlocked.
> MutexAPI can't use the same connection as the operation it's protecting.
> (TxnHandler.checkLock(Connection dbConn, long extLockId) is an example).
> We could make MutexAPI use a separate connection pool (size > 'primary' conn
> pool).
> Or we could make TxnHandler.lock(LockRequest rqst) return immediately after
> enqueueing the lock with the expectation that the caller will always follow
> up with a call to checkLock(CheckLockRequest rqst).
> cc [~f1sherox]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
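
The pool-exhaustion scenario in the issue description can be reproduced in miniature. The sketch below is not Hive code: the class name `PoolExhaustionSketch`, the `run` helper, and the use of `java.util.concurrent.Semaphore` as a stand-in for a fixed-size JDBC connection pool are all assumptions made for illustration. Each simulated caller holds one "connection" for its operation and then asks for a second one for the mutex. A timeout replaces the indefinite wait so the program terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; a Semaphore models a fixed-size connection pool.
final class PoolExhaustionSketch {

    /**
     * Starts `callers` concurrent operations. Each takes one permit from `primary`
     * (the operation's connection), waits until every caller holds one, and then
     * tries to take a permit from `mutexPool` (the mutex's connection).
     * `callers` must not exceed the size of `primary`.
     * Returns how many callers obtained the mutex connection within the timeout.
     */
    static int run(Semaphore primary, Semaphore mutexPool, int callers, long timeoutMs)
            throws InterruptedException {
        ExecutorService exec = Executors.newFixedThreadPool(callers);
        CountDownLatch allHoldPrimary = new CountDownLatch(callers);
        CountDownLatch allAttempted = new CountDownLatch(callers);
        AtomicInteger succeeded = new AtomicInteger();

        for (int i = 0; i < callers; i++) {
            exec.submit(() -> {
                try {
                    primary.acquire();                  // connection for the operation itself
                    try {
                        allHoldPrimary.countDown();
                        allHoldPrimary.await();         // every caller now holds a connection
                        if (mutexPool.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                            try {
                                succeeded.incrementAndGet();
                            } finally {
                                mutexPool.release();
                            }
                        }
                        allAttempted.countDown();
                        allAttempted.await();           // hold the first connection until all have tried
                    } finally {
                        primary.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        exec.shutdown();
        exec.awaitTermination(10, TimeUnit.SECONDS);
        return succeeded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Mutex draws from the same pool as the operations: nobody can proceed.
        Semaphore shared = new Semaphore(3);
        System.out.println("shared pool:   " + run(shared, shared, 3, 200));

        // Mutex draws from its own pool: every caller eventually gets it.
        System.out.println("separate pool: " + run(new Semaphore(3), new Semaphore(1), 3, 200));
    }
}
```

With a shared pool, no caller obtains the mutex connection because every permit is already held by a caller waiting for a second one. With a separate mutex pool, the callers take turns on it and all complete, which corresponds to the first remedy proposed in the description.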