Mahesh Raju Somalaraju created HIVE-27198:
---------------------------------------------

             Summary: Delete directly aborted transactions instead of select 
and loading ids
                 Key: HIVE-27198
                 URL: https://issues.apache.org/jira/browse/HIVE-27198
             Project: Hive
          Issue Type: Improvement
            Reporter: Mahesh Raju Somalaraju
            Assignee: Mahesh Raju Somalaraju


When cleaning up aborted transactions, we can delete the txns directly instead 
of selecting them first and then processing the ids.

method name: 

cleanEmptyAbortedAndCommittedTxns

Current code:

String s = "SELECT \"TXN_ID\" FROM \"TXNS\" WHERE " +
"\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
" (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
TxnStatus.COMMITTED + ") AND "
+ " \"TXN_ID\" < " + lowWaterMark;

 

proposed code:

String s = "DELETE FROM \"TXNS\" WHERE " +
"\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " +
" (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + 
TxnStatus.COMMITTED + ") AND "
+ " \"TXN_ID\" < " + lowWaterMark;

 
The SELECT should be eliminated, and the DELETE should use the WHERE clause 
directly instead of an IN clause built from the selected ids.
There is no reason to load the ids into memory and then generate a huge 
SQL statement.
 
Batching is also not necessary here; we can delete the records directly.
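
A minimal sketch of how the direct delete might look, assuming a plain JDBC 
Connection. The class name, the helper names buildDeleteSql and 
deleteEmptyTxns, and the stand-in TxnStatus enum (including its 
single-character codes) are hypothetical and only for illustration; they are 
not the actual Hive implementation.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class DirectTxnCleanupSketch {

    // Stand-in for Hive's TxnStatus. The codes 'a' and 'c' are assumptions
    // made so that this sketch compiles on its own.
    enum TxnStatus {
        ABORTED('a'), COMMITTED('c');

        private final char code;

        TxnStatus(char code) {
            this.code = code;
        }

        @Override
        public String toString() {
            // Rendered as a quoted SQL character literal, e.g. 'a'
            return "'" + code + "'";
        }
    }

    // Builds one DELETE with the same WHERE clause the SELECT used,
    // so no ids are loaded into memory.
    static String buildDeleteSql(long lowWaterMark) {
        return "DELETE FROM \"TXNS\" WHERE "
            + "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND "
            + "(\"TXN_STATE\" = " + TxnStatus.ABORTED
            + " OR \"TXN_STATE\" = " + TxnStatus.COMMITTED + ") AND "
            + "\"TXN_ID\" < " + lowWaterMark;
    }

    // Runs the DELETE in a single statement and returns the number of rows removed.
    static int deleteEmptyTxns(Connection conn, long lowWaterMark) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            return stmt.executeUpdate(buildDeleteSql(lowWaterMark));
        }
    }
}
```

With this shape the database evaluates the subquery and removes the rows in 
one round trip, and executeUpdate reports how many rows were deleted, which 
can be logged in place of the size of the id list.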



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
