[ 
https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=857254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-857254
 ]

ASF GitHub Bot logged work on HIVE-27020:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Apr/23 18:14
            Start Date: 16/Apr/23 18:14
    Worklog Time Spent: 10m 
      Work Description: SourabhBadhya commented on code in PR #4091:
URL: https://github.com/apache/hive/pull/4091#discussion_r1167991523


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList createValidTxnListForCleaner(GetOpenTxnsResponse txns
     bitSet.set(0, abortedTxns.length);
     //add ValidCleanerTxnList? - could be problematic for all the places that read it from
     // string as they'd have to know which object to instantiate
-    return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark, Long.MAX_VALUE);
+    return new ValidReadTxnList(abortedTxns, bitSet, highWatermark, Long.MAX_VALUE);
+  }
+
+  public static ValidTxnList createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long minOpenTxn) {
+    long highWatermark = minOpenTxn - 1;
+    long[] exceptions = new long[txns.getOpen_txnsSize()];
+    int i = 0;
+    BitSet abortedBits = BitSet.valueOf(txns.getAbortedBits());
+    // getOpen_txns() guarantees that the list contains only aborted & open txns.
+    // exceptions list must contain both txn types since validWriteIdList filters out the aborted ones and valid ones for that table.
+    // If a txn is not in exception list, it is considered as a valid one and thought of as an uncompacted write.
+    // See TxnHandler#getValidWriteIdsForTable() for more details.
+    for(long txnId : txns.getOpen_txns()) {

Review Comment:
   This loop stops as soon as a txnId exceeds highWatermark. Its main purpose 
is to build the exception list.



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList createValidTxnListForCleaner(GetOpenTxnsResponse txns
     bitSet.set(0, abortedTxns.length);
     //add ValidCleanerTxnList? - could be problematic for all the places that read it from
     // string as they'd have to know which object to instantiate
-    return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark, Long.MAX_VALUE);
+    return new ValidReadTxnList(abortedTxns, bitSet, highWatermark, Long.MAX_VALUE);
+  }
+
+  public static ValidTxnList createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long minOpenTxn) {
+    long highWatermark = minOpenTxn - 1;
+    long[] exceptions = new long[txns.getOpen_txnsSize()];
+    int i = 0;
+    BitSet abortedBits = BitSet.valueOf(txns.getAbortedBits());
+    // getOpen_txns() guarantees that the list contains only aborted & open txns.
+    // exceptions list must contain both txn types since validWriteIdList filters out the aborted ones and valid ones for that table.
+    // If a txn is not in exception list, it is considered as a valid one and thought of as an uncompacted write.
+    // See TxnHandler#getValidWriteIdsForTable() for more details.
+    for(long txnId : txns.getOpen_txns()) {
+      if(txnId > highWatermark) {
+        break;
+      }
+      exceptions[i] = txnId;
+      i++;
+    }
+    exceptions = Arrays.copyOf(exceptions, i);
+    //add ValidCleanerTxnList? - could be problematic for all the places that read it from

Review Comment:
   Removed it. Done.



##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList createValidTxnListForCleaner(GetOpenTxnsResponse txns
     bitSet.set(0, abortedTxns.length);
     //add ValidCleanerTxnList? - could be problematic for all the places that read it from
     // string as they'd have to know which object to instantiate
-    return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark, Long.MAX_VALUE);
+    return new ValidReadTxnList(abortedTxns, bitSet, highWatermark, Long.MAX_VALUE);
+  }
+
+  public static ValidTxnList createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long minOpenTxn) {

Review Comment:
   I have renamed `createValidTxnListForCleaner` to 
`createValidTxnListForCompactionCleaner`. It differs from 
`createValidTxnListForAbortedTxnCleaner` mainly in that the latter does not 
truncate the abortedBits, since truncating them seems unnecessary. We are also 
not concerned if open txns from other tables are present in this list (an open 
txn on the same table will be handled, since highWatermark is set to the min 
open txn for that table - 1). We just create an exception list bounded by the 
highWatermark and use it to create the validWriteIdList.
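
   To illustrate the last point, here is a minimal standalone sketch of how the 
exception list is bounded by highWatermark. It uses a plain `List<Long>` in 
place of `GetOpenTxnsResponse`, and the names `ExceptionListSketch` and 
`buildExceptions` are hypothetical, not part of the PR. It also assumes the txn 
ids arrive sorted in ascending order, which is what makes the early `break` 
safe.

```java
import java.util.Arrays;
import java.util.List;

public class ExceptionListSketch {

    // Hypothetical helper, not part of the PR: copies every txn id at or
    // below the high watermark into the exception list.
    // Assumption: txnIds is sorted ascending, so the loop can stop at the
    // first id above the watermark.
    static long[] buildExceptions(List<Long> txnIds, long minOpenTxn) {
        long highWatermark = minOpenTxn - 1;
        long[] exceptions = new long[txnIds.size()];
        int i = 0;
        for (long txnId : txnIds) {
            if (txnId > highWatermark) {
                break;
            }
            exceptions[i++] = txnId;
        }
        // Trim the unused tail of the array.
        return Arrays.copyOf(exceptions, i);
    }

    public static void main(String[] args) {
        // With minOpenTxn = 8 the high watermark is 7, so only 3 and 5 are kept.
        long[] exceptions = buildExceptions(List.of(3L, 5L, 8L, 12L), 8L);
        System.out.println(Arrays.toString(exceptions)); // prints [3, 5]
    }
}
```

   Txn ids above the watermark (8 and 12 in this example) never reach the 
exception list.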





Issue Time Tracking
-------------------

    Worklog Id:     (was: 857254)
    Time Spent: 13h 10m  (was: 13h)

> Implement a separate handler to handle aborted transaction cleanup
> ------------------------------------------------------------------
>
>                 Key: HIVE-27020
>                 URL: https://issues.apache.org/jira/browse/HIVE-27020
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sourabh Badhya
>            Assignee: Sourabh Badhya
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> As described in the parent task, once the cleaner is separated into different 
> entities, implement a separate handler which can create requests for aborted 
> transactions cleanup. This would move the aborted transaction cleanup 
> exclusively to the cleaner.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
