[ https://issues.apache.org/jira/browse/HIVE-23560?focusedWorklogId=459627&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-459627 ]
ASF GitHub Bot logged work on HIVE-23560:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Jul/20 03:37
            Start Date: 16/Jul/20 03:37
    Worklog Time Spent: 10m
      Work Description:
aasha commented on a change in pull request #1232:
URL: https://github.com/apache/hive/pull/1232#discussion_r455490260

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##########
@@ -999,20 +1033,27 @@ String getValidTxnListForReplDump(Hive hiveDb, long waitUntilTime) throws HiveEx
       } catch (InterruptedException e) {
         LOG.info("REPL DUMP thread sleep interrupted", e);
       }
-      validTxnList = getTxnMgr().getValidTxns();
+      validTxnList = getTxnMgr().getValidTxns(Arrays.asList(TxnType.READ_ONLY, TxnType.REPL_CREATED));
     }

     // After the timeout just force abort the open txns
-    List<Long> openTxns = getOpenTxns(validTxnList);
-    if (!openTxns.isEmpty()) {
-      hiveDb.abortTransactions(openTxns);
-      validTxnList = getTxnMgr().getValidTxns();
-      if (validTxnList.getMinOpenTxn() != null) {
-        openTxns = getOpenTxns(validTxnList);
-        LOG.warn("REPL DUMP unable to force abort all the open txns: {} after timeout due to unknown reasons. " +
-          "However, this is rare case that shouldn't happen.", openTxns);
-        throw new IllegalStateException("REPL DUMP triggered abort txns failed for unknown reasons.");
+    if (conf.getBoolVar(REPL_BOOTSTRAP_DUMP_ABORT_WRITE_TXN_AFTER_TIMEOUT)) {
+      List<Long> openTxns = getOpenTxns(validTxnList, work.dbNameOrPattern);
+      if (!openTxns.isEmpty()) {
+        //abort only write transactions for the db under replication if abort transactions is enabled.
+        hiveDb.abortTransactions(openTxns);
+        validTxnList = getTxnMgr().getValidTxns(Arrays.asList(TxnType.READ_ONLY, TxnType.REPL_CREATED));

Review comment:
       If we use the already obtained validTxnList we won't know if there are still open txns. This is to check all open txns that were previously open, are aborted and not part of invalid txn list again.
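
The point made in the review comment (a snapshot taken before the abort cannot show whether the abort worked, so the list has to be fetched again) can be illustrated with a small in-memory sketch. This is not the Hive implementation: ToyTxnManager, Txn, TxnState, abortWriteTxns and the db names are all hypothetical, and treating READ_ONLY and REPL_CREATED as types to leave out of the snapshot is an assumption read off the diff above.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Hive classes.
class AbortWriteTxnsSketch {

  // Hypothetical enum; DEFAULT stands in for an ordinary write txn.
  enum TxnType { DEFAULT, READ_ONLY, REPL_CREATED }

  enum TxnState { OPEN, ABORTED }

  static final class Txn {
    final long id;
    final TxnType type;
    final String db;
    TxnState state = TxnState.OPEN;

    Txn(long id, TxnType type, String db) {
      this.id = id;
      this.type = type;
      this.db = db;
    }
  }

  // Hypothetical in-memory stand-in for a transaction manager.
  static final class ToyTxnManager {
    private final Map<Long, Txn> txns = new LinkedHashMap<>();

    void open(long id, TxnType type, String db) {
      txns.put(id, new Txn(id, type, db));
    }

    // Returns a fresh snapshot of open txn ids for one db, skipping the excluded types.
    List<Long> openTxns(String db, Set<TxnType> excluded) {
      List<Long> ids = new ArrayList<>();
      for (Txn t : txns.values()) {
        if (t.state == TxnState.OPEN && t.db.equals(db) && !excluded.contains(t.type)) {
          ids.add(t.id);
        }
      }
      return ids;
    }

    void abort(List<Long> ids) {
      for (Long id : ids) {
        Txn t = txns.get(id);
        if (t != null) {
          t.state = TxnState.ABORTED;
        }
      }
    }
  }

  // Aborts open write txns for one db, then takes a new snapshot to confirm none remain open.
  static List<Long> abortWriteTxns(ToyTxnManager mgr, String db) {
    Set<TxnType> excluded = EnumSet.of(TxnType.READ_ONLY, TxnType.REPL_CREATED);
    List<Long> open = mgr.openTxns(db, excluded);
    if (open.isEmpty()) {
      return open;
    }
    mgr.abort(open);
    // The list in 'open' predates the abort and still names these ids,
    // so only a second fetch can show whether anything is still open.
    List<Long> stillOpen = mgr.openTxns(db, excluded);
    if (!stillOpen.isEmpty()) {
      throw new IllegalStateException("Unable to abort open txns: " + stillOpen);
    }
    return open;
  }

  public static void main(String[] args) {
    ToyTxnManager mgr = new ToyTxnManager();
    mgr.open(1L, TxnType.DEFAULT, "sales");
    mgr.open(2L, TxnType.READ_ONLY, "sales");
    mgr.open(3L, TxnType.DEFAULT, "hr");
    System.out.println("aborted: " + abortWriteTxns(mgr, "sales"));
  }
}
```

In this toy model the read-only txn on the same db and the write txn on the other db are left untouched, which matches the behaviour the issue description proposes.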
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 459627)
    Time Spent: 1h  (was: 50m)

> Optimize bootstrap dump to abort only write Transactions
> --------------------------------------------------------
>
>                 Key: HIVE-23560
>                 URL: https://issues.apache.org/jira/browse/HIVE-23560
>             Project: Hive
>          Issue Type: Task
>            Reporter: Aasha Medhi
>            Assignee: Aasha Medhi
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23560.01.patch, HIVE-23560.02.patch, Optimize
> bootstrap dump to avoid aborting all transactions.pdf
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently before doing a bootstrap dump, we abort all open transactions after
> waiting for a configured time. We are proposing to abort only write
> transactions for the db under replication and leave the read and repl created
> transactions as is.
> This doc attached talks about it in detail

--
This message was sent by Atlassian Jira
(v8.3.4#803005)