[
https://issues.apache.org/jira/browse/HIVE-29459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HIVE-29459:
----------------------------------
Labels: pull-request-available replication (was: replication)
> [DR][ACIDReplication] Add clearDanglingTxnTaskTask at the end
> -------------------------------------------------------------
>
> Key: HIVE-29459
> URL: https://issues.apache.org/jira/browse/HIVE-29459
> Project: Hive
> Issue Type: Bug
> Components: repl
> Affects Versions: 4.2.0
> Reporter: Harshal Patel
> Assignee: Harshal Patel
> Priority: Major
> Labels: pull-request-available, replication
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> Currently, clearDanglingTxnTaskTask is added at the end of ReplLoadTask.
> That works in the normal scenario:
>
> {code:java}
> if (conf.getBoolVar(HiveConf.ConfVars.HIVE_REPL_CLEAR_DANGLING_TXNS_ON_TARGET)) {
>   ClearDanglingTxnWork clearDanglingTxnWork =
>       new ClearDanglingTxnWork(work.getDumpDirectory(), targetDb.getName());
>   Task<ClearDanglingTxnWork> clearDanglingTxnTaskTask =
>       TaskFactory.get(clearDanglingTxnWork, conf);
>   if (childTasks.isEmpty()) {
>     childTasks.add(clearDanglingTxnTaskTask);
>   } else {
>     DAGTraversal.traverse(childTasks,
>         new AddDependencyToLeaves(Collections.singletonList(clearDanglingTxnTaskTask)));
>   }
> }
> return 0;
> {code}
>
> [https://github.com/apache/hive/blob/38a963540000729f0ac8e8d2ac9cd1ca22930d2a/ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplLoadTask.java#L966]
> But if the number of events in an incremental load exceeds
> {{hive.repl.approx.max.load.tasks}}, the load operation can break the tasks
> into batches of approximately {{hive.repl.approx.max.load.tasks}} (not a
> hard limit).
> In that case, clearDanglingTxnTaskTask is called between batches rather than
> only once at the end of the load cycle. This can prematurely clean
> repl_txn_map and abort transactions in the middle of replication.
> Fix: add an additional check, i.e.:
> {code:java}
> boolean hasPendingIncrementalWork =
>     builder.hasMoreWork() || work.hasBootstrapLoadTasks();
> if (conf.getBoolVar(HiveConf.ConfVars.HIVE_REPL_CLEAR_DANGLING_TXNS_ON_TARGET)
>     && !hasPendingIncrementalWork) {
>   // existing ClearDanglingTxnWork scheduling, unchanged
> }
> {code}
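
The effect of the extra check can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: the class DanglingTxnCleanupSketch, the method runLoadCycle, and the step labels are hypothetical and are not Hive APIs; batching is modelled as a strict limit even though the issue says the real limit is approximate; and the local flag hasPendingIncrementalWork merely stands in for the builder.hasMoreWork() || work.hasBootstrapLoadTasks() condition quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a batched incremental load.
// None of these names are Hive APIs; they exist only for illustration.
class DanglingTxnCleanupSketch {

    /**
     * Simulates one load cycle of totalEvents events, split into batches of at
     * most maxTasksPerBatch, and returns the steps in the order they would run.
     * With guardEnabled == false, cleanup is scheduled after every batch (the
     * behaviour the issue reports); with true, only when no work is pending.
     */
    static List<String> runLoadCycle(int totalEvents, int maxTasksPerBatch, boolean guardEnabled) {
        if (maxTasksPerBatch < 1) {
            throw new IllegalArgumentException("maxTasksPerBatch must be at least 1");
        }
        List<String> steps = new ArrayList<>();
        int remaining = totalEvents;
        do {
            int batchSize = Math.min(remaining, maxTasksPerBatch);
            remaining -= batchSize;
            steps.add("load:" + batchSize);

            // Assumption: stands in for the pending-work condition from the issue.
            boolean hasPendingIncrementalWork = remaining > 0;
            if (!guardEnabled || !hasPendingIncrementalWork) {
                steps.add("clearDanglingTxns");
            }
        } while (remaining > 0);
        return steps;
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + runLoadCycle(7, 3, false));
        System.out.println("with guard:    " + runLoadCycle(7, 3, true));
    }
}
```

In this model, seven events with a batch size of three produce three batches. Without the guard, the cleanup step appears after each batch; with the guard, it appears once, after the final batch.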