[ https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-11317:
----------------------------------
    Description: 
The logic to abort transactions that have stopped heartbeating is in
TxnHandler.timeOutTxns().
This is only called when DbTxnManager.getValidTxns() is called.
So if there are a lot of txns that need to be timed out and there are no
SQL clients talking to the system, nothing aborts the dead transactions,
and thus compaction can't clean them up, so garbage accumulates in the system.

Also, the streaming API doesn't call DbTxnManager at all.

Need to move this logic into Initiator (or some other metastore-side thread).
Also, make sure it is broken up into multiple small(er) transactions against
the metastore DB.

Also move timeOutLocks() there as well.


See about adding a TXNS.COMMENT field, which can be used for "Auto aborted due to
timeout", for example.

  was:
The logic to abort transactions that have stopped heartbeating is in
TxnHandler.timeOutTxns().
This is only called when DbTxnManager.getValidTxns() is called.
So if there are a lot of txns that need to be timed out and there are no
SQL clients talking to the system, nothing aborts the dead transactions,
and thus compaction can't clean them up, so garbage accumulates in the system.

Also, the streaming API doesn't call DbTxnManager at all.

Need to move this logic into Initiator (or some other metastore-side thread).
Also, make sure it is broken up into multiple small(er) transactions against
the metastore DB.

Also move timeOutLocks() there as well.



> ACID: Improve transaction Abort logic due to timeout
> ----------------------------------------------------
>
>                 Key: HIVE-11317
>                 URL: https://issues.apache.org/jira/browse/HIVE-11317
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore, Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>              Labels: triage
>
> The logic to abort transactions that have stopped heartbeating is in
> TxnHandler.timeOutTxns().
> This is only called when DbTxnManager.getValidTxns() is called.
> So if there are a lot of txns that need to be timed out and there are no
> SQL clients talking to the system, nothing aborts the dead
> transactions, and thus compaction can't clean them up, so garbage accumulates
> in the system.
> Also, the streaming API doesn't call DbTxnManager at all.
> Need to move this logic into Initiator (or some other metastore-side thread).
> Also, make sure it is broken up into multiple small(er) transactions against
> the metastore DB.
> Also move timeOutLocks() there as well.
> See about adding a TXNS.COMMENT field, which can be used for "Auto aborted due
> to timeout", for example.
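
The proposal above (abort timed-out txns from a metastore-side pass, in
several small units of work, and record a reason) could look roughly like the
sketch below. This is only an in-memory illustration under stated assumptions:
TxnTimeoutReaper, Txn, abortOneBatch and runOnce are hypothetical names, not
Hive classes or methods, and the map stands in for the TXNS table. Each call to
abortOneBatch models one small transaction against the metastore DB.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Hive.
final class TxnTimeoutReaper {
    enum State { OPEN, ABORTED }

    // Stand-in for one TXNS row; "comment" mirrors the proposed TXNS.COMMENT field.
    static final class Txn {
        long lastHeartbeatMs;
        State state = State.OPEN;
        String comment;
        Txn(long lastHeartbeatMs) { this.lastHeartbeatMs = lastHeartbeatMs; }
    }

    private final Map<Long, Txn> txns = new LinkedHashMap<>();
    private final long timeoutMs;
    private final int batchSize;

    TxnTimeoutReaper(long timeoutMs, int batchSize) {
        this.timeoutMs = timeoutMs;
        this.batchSize = batchSize;
    }

    synchronized void open(long id, long nowMs) {
        txns.put(id, new Txn(nowMs));
    }

    // Only open transactions can be kept alive by a heartbeat.
    synchronized void heartbeat(long id, long nowMs) {
        Txn t = txns.get(id);
        if (t != null && t.state == State.OPEN) {
            t.lastHeartbeatMs = nowMs;
        }
    }

    synchronized State stateOf(long id) {
        Txn t = txns.get(id);
        return t == null ? null : t.state;
    }

    synchronized String commentOf(long id) {
        Txn t = txns.get(id);
        return t == null ? null : t.comment;
    }

    // Aborts at most batchSize timed-out txns; models one small DB transaction.
    synchronized int abortOneBatch(long nowMs) {
        int aborted = 0;
        for (Txn t : txns.values()) {
            if (aborted == batchSize) {
                break;
            }
            if (t.state == State.OPEN && nowMs - t.lastHeartbeatMs > timeoutMs) {
                t.state = State.ABORTED;
                t.comment = "Auto aborted due to timeout";
                aborted++;
            }
        }
        return aborted;
    }

    // One reaper pass: keep running small batches until nothing is left to abort.
    int runOnce(long nowMs) {
        int total = 0;
        int n;
        while ((n = abortOneBatch(nowMs)) > 0) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        TxnTimeoutReaper reaper = new TxnTimeoutReaper(1000, 2);
        for (long id = 1; id <= 5; id++) {
            reaper.open(id, 0);
        }
        reaper.heartbeat(5, 900); // txn 5 is still alive at t=1500
        int aborted = reaper.runOnce(1500);
        System.out.println("aborted=" + aborted + " txn5=" + reaper.stateOf(5));
    }
}
```

In a real metastore-side thread, runOnce would presumably be driven on a fixed
delay (for example via java.util.concurrent.ScheduledExecutorService) rather
than from getValidTxns(), so that dead txns get aborted even when no SQL client
or streaming client is talking to the system.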



