Mahesh Raju Somalaraju created HIVE-27198: ---------------------------------------------
Summary: Delete directly aborted transactions instead of select and loading ids Key: HIVE-27198 URL: https://issues.apache.org/jira/browse/HIVE-27198 Project: Hive Issue Type: Improvement Reporter: Mahesh Raju Somalaraju Assignee: Mahesh Raju Somalaraju in cleaning the aborted transaction , we can directly deletes the txns instead of selecting and process. method name: cleanEmptyAbortedAndCommittedTxns Code: String s = "SELECT \"TXN_ID\" FROM \"TXNS\" WHERE " + "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " + " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + TxnStatus.COMMITTED + ") AND " + " \"TXN_ID\" < " + lowWaterMark; proposed code: String s = "DELETE \"TXN_ID\" FROM \"TXNS\" WHERE " + "\"TXN_ID\" NOT IN (SELECT \"TC_TXNID\" FROM \"TXN_COMPONENTS\") AND " + " (\"TXN_STATE\" = " + TxnStatus.ABORTED + " OR \"TXN_STATE\" = " + TxnStatus.COMMITTED + ") AND " + " \"TXN_ID\" < " + lowWaterMark; the select needs to be eliminated and the delete should work with the where clause instead of the built in clause we can see no reason for loading the ids into memory and then generate a huge sql Bathcing is also not necessary here, we can deletes the records directly -- This message was sent by Atlassian Jira (v8.20.10#820010)