[ 
https://issues.apache.org/jira/browse/HIVE-8850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14217027#comment-14217027
 ] 

Sushanth Sowmyan commented on HIVE-8850:
----------------------------------------

In addition to the rollback issue here, there's one more nasty interplay 
between open and rollback: an open after a rollback will simply nuke the txn 
state and increment the transaction count. Thus, the following can now happen:

{noformat}
setConf (count = 0)
openTransaction (count = 1)
        openTransaction (count = 2)
                (... some code that fails ...)
        rollbackTransaction (count = 0)
        openTransaction (count = 1)
                (... some code that succeeds ...)
        commitTransaction (count = 0)
commitTransaction (count = -1)
{noformat}

Normally, after a rollback, subsequent commits realize that the txn is rolling 
back, so they do nothing. But if any code ever calls open again instead of 
going down the commit chain, we have a problem.
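
To make the counter arithmetic concrete, here is a minimal sketch that replays the trace above. It is a toy model, not the real ObjectStore code: the class name, the boolean flag, and both replay methods are made up for illustration, and it lets the counter go negative only to mirror the trace.

```java
/** Hypothetical model of the open/rollback/commit counter; NOT the real ObjectStore. */
final class TxnCounterSketch {
    private int openCalls = 0;          // the "count" column in the trace
    private boolean rolledBack = false; // stand-in for the txn state

    void openTransaction() {
        // The problem described above: open wipes the rollback state
        // and just increments the counter.
        rolledBack = false;
        openCalls++;
    }

    void rollbackTransaction() {
        // Rollback unwinds every nesting level at once.
        rolledBack = true;
        openCalls = 0;
    }

    void commitTransaction() {
        if (rolledBack) {
            return; // normal path: commits after a rollback do nothing
        }
        openCalls--; // assumption: allowed to go negative, to match the trace
    }

    /** Replays the failing trace: open, open, rollback, open, commit, commit. */
    static int replayFailingSequence() {
        TxnCounterSketch s = new TxnCounterSketch();
        s.openTransaction();     // count = 1
        s.openTransaction();     // count = 2
        s.rollbackTransaction(); // count = 0
        s.openTransaction();     // count = 1, rollback state is gone
        s.commitTransaction();   // count = 0
        s.commitTransaction();   // count = -1
        return s.openCalls;
    }

    /** Replays the normal case: open, open, rollback, commit, commit. */
    static int replayNormalSequence() {
        TxnCounterSketch s = new TxnCounterSketch();
        s.openTransaction();
        s.openTransaction();
        s.rollbackTransaction();
        s.commitTransaction();   // no-op
        s.commitTransaction();   // no-op
        return s.openCalls;
    }

    public static void main(String[] args) {
        System.out.println(replayFailingSequence()); // prints -1
        System.out.println(replayNormalSequence());  // prints 0
    }
}
```

The only difference between the two replays is the extra open after the rollback, which is what unbalances the outer commit.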

Again, normally, this shouldn't happen, because a rollback indicates that the 
txn should fail and that we should roll all the way out. The case where that 
wouldn't hold, however, is inside a direct sql helper: we attempt to use direct 
sql, open a txn, fail, rollback, and then fall back to using jdo, where we open 
a transaction again and have that succeed.

Now, ordinarily, this situation is also protected against, since the directsql 
portion does not open a new transaction (and roll it back on failure) if it 
detects that it's already inside a transaction. However, directsql detects a 
nested transaction simply by checking whether there is already an active 
transaction. That check can come up false if the transaction got invalidated 
(set inactive, effectively) by bonecp, etc. In that scenario, we wind up with 
the behaviour described above: directsql decides that isInTxn is false (since 
the current txn is invalid), opens a txn for the directsql part, and rolls back 
if that fails.
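
A rough sketch of that detection, as I understand it. All names here (FakeTxn, helperOpensOwnTxn) are hypothetical stand-ins, not the actual metastore classes, and the real check may look different in detail.

```java
/** Hypothetical sketch of the "am I nested?" check; NOT the real directsql helper. */
final class DirectSqlTxnCheckSketch {
    /** Stand-in for the current datastore transaction handle. */
    static final class FakeTxn {
        final boolean active;
        FakeTxn(boolean active) { this.active = active; }
    }

    /** Nested-ness is inferred only from whether the current txn is active. */
    static boolean isInTxn(FakeTxn current) {
        return current != null && current.active;
    }

    /** The helper opens (and on failure rolls back) its own txn only when not nested. */
    static boolean helperOpensOwnTxn(FakeTxn current) {
        return !isInTxn(current);
    }

    public static void main(String[] args) {
        FakeTxn healthy = new FakeTxn(true);      // outer txn still active
        FakeTxn invalidated = new FakeTxn(false); // outer txn set inactive, e.g. by the pool
        System.out.println(helperOpensOwnTxn(healthy));     // prints false
        System.out.println(helperOpensOwnTxn(invalidated)); // prints true
    }
}
```

In the second case the outer caller still believes it holds an open transaction, so a rollback by the helper followed by a fresh open in the jdo fallback leads straight into the unbalanced sequence above.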

Tagging [~sershe] to share this headache with him as well. :p

> ObjectStore:: rollbackTransaction() should set the transaction status to 
> TXN_STATUS.ROLLBACK irrespective of whether it is active or not
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8850
>                 URL: https://issues.apache.org/jira/browse/HIVE-8850
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-8850.1.patch
>
>
> We can run into issues as described below:
> A Hive script adds 2800 partitions to a table and during this it can get a 
> SQLState 08S01 [Communication Link Error] and bonecp kills all the connections 
> in the pool. The partitions are added and a create table statement executes 
> (Metering_IngestedData_Compressed). The map job finishes successfully and 
> while moving the table to the hive warehouse the ObjectStore.java 
> commitTransaction() raises the error: commitTransaction was called but 
> openTransactionCalls = 0. This probably indicates that there are unbalanced 
> calls to openTransaction/commitTransaction



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
