[ https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=857254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-857254 ]
ASF GitHub Bot logged work on HIVE-27020:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Apr/23 18:14
            Start Date: 16/Apr/23 18:14
    Worklog Time Spent: 10m
      Work Description: SourabhBadhya commented on code in PR #4091:
URL: https://github.com/apache/hive/pull/4091#discussion_r1167991523

##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList createValidTxnListForCleaner(GetOpenTxnsResponse txns
     bitSet.set(0, abortedTxns.length);
     //add ValidCleanerTxnList? - could be problematic for all the places that read it from
     // string as they'd have to know which object to instantiate
-    return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark, Long.MAX_VALUE);
+    return new ValidReadTxnList(abortedTxns, bitSet, highWatermark, Long.MAX_VALUE);
+  }
+
+  public static ValidTxnList createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long minOpenTxn) {
+    long highWatermark = minOpenTxn - 1;
+    long[] exceptions = new long[txns.getOpen_txnsSize()];
+    int i = 0;
+    BitSet abortedBits = BitSet.valueOf(txns.getAbortedBits());
+    // getOpen_txns() guarantees that the list contains only aborted & open txns.
+    // exceptions list must contain both txn types since validWriteIdList filters out the aborted ones and valid ones for that table.
+    // If a txn is not in exception list, it is considered as a valid one and thought of as an uncompacted write.
+    // See TxnHandler#getValidWriteIdsForTable() for more details.
+    for(long txnId : txns.getOpen_txns()) {

Review Comment:
   This loop is bounded by the value of highWatermark. It is mainly used for creating the exception list.

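For readers following the thread, the bounded loop described in the comment above can be tried in isolation. This is only a minimal sketch, not the Hive code: the class name `ExceptionListSketch` and the method `buildExceptions` are hypothetical, a plain `List<Long>` stands in for `GetOpenTxnsResponse#getOpen_txns()`, and the input is assumed to be sorted in ascending order (which is what makes the early `break` safe).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch; it does not depend on any Hive classes.
class ExceptionListSketch {

  /**
   * Collects every txn id at or below the high watermark (minOpenTxn - 1).
   * Assumption: openAndAbortedTxns is sorted ascending, so the loop can stop
   * at the first id that exceeds the watermark.
   */
  static long[] buildExceptions(List<Long> openAndAbortedTxns, long minOpenTxn) {
    long highWatermark = minOpenTxn - 1;
    // Allocate for the worst case, then trim to the number of ids actually kept.
    long[] exceptions = new long[openAndAbortedTxns.size()];
    int i = 0;
    for (long txnId : openAndAbortedTxns) {
      if (txnId > highWatermark) {
        break;
      }
      exceptions[i++] = txnId;
    }
    return Arrays.copyOf(exceptions, i);
  }

  public static void main(String[] args) {
    // minOpenTxn = 9 gives highWatermark = 8, so txn 12 is left out.
    long[] result = buildExceptions(List.of(3L, 5L, 8L, 12L), 9L);
    System.out.println(Arrays.toString(result)); // prints [3, 5, 8]
  }
}
```

Over-allocating and trimming with `Arrays.copyOf` mirrors the shape of the loop in the diff. If the input were not sorted, the `break` would have to become a `continue`.
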
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList createValidTxnListForCleaner(GetOpenTxnsResponse txns
     bitSet.set(0, abortedTxns.length);
     //add ValidCleanerTxnList? - could be problematic for all the places that read it from
     // string as they'd have to know which object to instantiate
-    return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark, Long.MAX_VALUE);
+    return new ValidReadTxnList(abortedTxns, bitSet, highWatermark, Long.MAX_VALUE);
+  }
+
+  public static ValidTxnList createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long minOpenTxn) {
+    long highWatermark = minOpenTxn - 1;
+    long[] exceptions = new long[txns.getOpen_txnsSize()];
+    int i = 0;
+    BitSet abortedBits = BitSet.valueOf(txns.getAbortedBits());
+    // getOpen_txns() guarantees that the list contains only aborted & open txns.
+    // exceptions list must contain both txn types since validWriteIdList filters out the aborted ones and valid ones for that table.
+    // If a txn is not in exception list, it is considered as a valid one and thought of as an uncompacted write.
+    // See TxnHandler#getValidWriteIdsForTable() for more details.
+    for(long txnId : txns.getOpen_txns()) {
+      if(txnId > highWatermark) {
+        break;
+      }
+      exceptions[i] = txnId;
+      i++;
+    }
+    exceptions = Arrays.copyOf(exceptions, i);
+    //add ValidCleanerTxnList? - could be problematic for all the places that read it from

Review Comment:
   Removed it. Done.

##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList createValidTxnListForCleaner(GetOpenTxnsResponse txns
     bitSet.set(0, abortedTxns.length);
     //add ValidCleanerTxnList? - could be problematic for all the places that read it from
     // string as they'd have to know which object to instantiate
-    return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark, Long.MAX_VALUE);
+    return new ValidReadTxnList(abortedTxns, bitSet, highWatermark, Long.MAX_VALUE);
+  }
+
+  public static ValidTxnList createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long minOpenTxn) {

Review Comment:
   I have renamed `createValidTxnListForCleaner` to `createValidTxnListForCompactionCleaner`. It differs from `createValidTxnListForAbortedTxnCleaner` mainly in that the latter does not truncate the abortedBits, which seems unnecessary there. We are also not concerned if open txns from other tables are present in this list (an open txn on the same table will obviously be handled, since highWatermark will be updated to the min open txn for that table - 1). We just create an exception list based on the highWatermark and use it for creating the validWriteIdList.


Issue Time Tracking
-------------------

    Worklog Id:     (was: 857254)
    Time Spent: 13h 10m  (was: 13h)

> Implement a separate handler to handle aborted transaction cleanup
> ------------------------------------------------------------------
>
>                 Key: HIVE-27020
>                 URL: https://issues.apache.org/jira/browse/HIVE-27020
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sourabh Badhya
>            Assignee: Sourabh Badhya
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> As described in the parent task, once the cleaner is separated into different
> entities, implement a separate handler which can create requests for aborted
> transactions cleanup. This would move the aborted transaction cleanup
> exclusively to the cleaner.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)