[ 
https://issues.apache.org/jira/browse/HIVE-26414?focusedWorklogId=794058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-794058
 ]

ASF GitHub Bot logged work on HIVE-26414:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jul/22 05:13
            Start Date: 22/Jul/22 05:13
    Worklog Time Spent: 10m 
      Work Description: SourabhBadhya commented on code in PR #3457:
URL: https://github.com/apache/hive/pull/3457#discussion_r927295389


##########
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:
##########
@@ -485,6 +480,26 @@ private void clearLocksAndHB() {
     stopHeartbeat();
   }
 
+  private void cleanupDirForCTAS() {

Review Comment:
   Currently, `Context ctx` is not available during `rollbackTxn`, which is why 
I chose to store the object.
   However, I agree that passing `Context ctx` is better.
   There are multiple ways of doing this:
   The first would be to pass `Context ctx` to the `rollbackTxn` method, 
which would change the `HiveTxnManager` API itself (I particularly don't like 
this, since it would be a breaking change).
   
   Or we could create a new function of the same name in the `HiveTxnManager` 
interface and call it from the driver when the rollback conditions are satisfied.
   
   My idea was to avoid both and initialise the destination in one of the 
existing APIs, but I am open to any other suggestions.
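
   For illustration, the second option might look roughly like the sketch 
below. This is not the code in the PR: `SketchContext`, `SketchTxnManager`, 
`SketchDbTxnManager`, `getCtasDestination` and `cleanupRequests` are 
hypothetical stand-ins, and using a `default` method for the new overload is 
an assumption, made so that existing implementors of the interface would keep 
compiling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hive's Context; it only carries the CTAS destination.
final class SketchContext {
    private final String ctasDestination;

    SketchContext(String ctasDestination) {
        this.ctasDestination = ctasDestination;
    }

    String getCtasDestination() {
        return ctasDestination;
    }
}

// Hypothetical stand-in for the HiveTxnManager interface.
interface SketchTxnManager {
    // Existing method; its signature stays unchanged.
    void rollbackTxn();

    // New overload of the same name. The default body (an assumption) simply
    // delegates, so implementors that do not override it are unaffected.
    default void rollbackTxn(SketchContext ctx) {
        rollbackTxn();
    }
}

// Hypothetical stand-in for DbTxnManager.
final class SketchDbTxnManager implements SketchTxnManager {
    // Records the directories for which cleanup would be requested.
    final List<String> cleanupRequests = new ArrayList<>();
    int rollbacks = 0;

    @Override
    public void rollbackTxn() {
        rollbacks++;
    }

    @Override
    public void rollbackTxn(SketchContext ctx) {
        // The destination is read from the passed context instead of being
        // stored on the transaction manager.
        if (ctx != null && ctx.getCtasDestination() != null) {
            cleanupRequests.add(ctx.getCtasDestination());
        }
        rollbackTxn();
    }
}
```

   In this sketch the driver would call `rollbackTxn(ctx)` when the rollback 
conditions are satisfied, while existing callers of `rollbackTxn()` stay as 
they are.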





Issue Time Tracking
-------------------

    Worklog Id:     (was: 794058)
    Time Spent: 1.5h  (was: 1h 20m)

> Aborted/Cancelled CTAS operations must initiate cleanup of uncommitted data
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-26414
>                 URL: https://issues.apache.org/jira/browse/HIVE-26414
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sourabh Badhya
>            Assignee: Sourabh Badhya
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When a CTAS query fails after writing the data but before the table is 
> created, the data remains in the directory and is not currently cleaned up 
> by the cleaner or any other mechanism. This is because the cleaner 
> requires a table corresponding to what it is cleaning. To get around this 
> situation, we can pass the relevant information directly to the cleaner so 
> that such uncommitted data is deleted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
