hudi-bot opened a new issue, #15714:
URL: https://github.com/apache/hudi/issues/15714

   Currently, when a user drops a partition using Spark SQL 
[https://spark.apache.org/docs/latest/sql-ref-syntax-ddl-alter-table.html#drop-partition]
 and then rolls back the commit that dropped it, the partition is not listed by 
the *SHOW PARTITIONS* command. The reason is that, as part of the drop partition 
operation, Hudi also deletes the partition from the table metadata. However, 
rolling the commit back does not add the partition back to the Hudi table 
metadata. Hence, *SHOW PARTITIONS* does not return the rolled-back partition.
   
    
   
   As part of the drop partition command, Hudi schedules a clean operation for 
this partition's data, treating the drop as a HARD delete. However, it is 
possible that the user rolls back the drop partition commit before the cleaner 
runs (or that the user has turned the cleaner off). In such scenarios, even 
though the data is rolled back, the partition still does not appear in the 
table metadata, leaving the Hudi table in a corrupt state.
   
    
   
   We think we can enhance this functionality to support rollback for drop 
partitions. If we decide against it, then we should disallow rolling back 
commits that drop a partition so users don't end up in this state.
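
To make the inconsistency and the two proposed options concrete, here is a toy model in plain Python. It is not Hudi code and does not use any Hudi API: `ToyTable`, the `policy` argument, and the partition and commit names are all hypothetical, chosen only to illustrate how the partition listing can diverge from the data after a rollback, and how either restoring the listing or rejecting the rollback would avoid that.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy stand-in for a table. Hypothetical, not Hudi's implementation."""
    data: dict = field(default_factory=dict)     # partition -> list of file names
    listed: set = field(default_factory=set)     # what SHOW PARTITIONS would return
    dropped: dict = field(default_factory=dict)  # commit id -> partition it dropped

    def write(self, partition, file_name):
        self.data.setdefault(partition, []).append(file_name)
        self.listed.add(partition)

    def drop_partition(self, commit, partition):
        # Files stay on storage until a cleaner runs; only the listing changes now.
        self.dropped[commit] = partition
        self.listed.discard(partition)

    def rollback(self, commit, policy="current"):
        if commit not in self.dropped:
            raise KeyError(f"{commit} is not a drop-partition commit")
        if policy == "reject":
            # Option 2 from the issue: disallow rolling back such commits.
            raise ValueError("rollback of a drop-partition commit is not allowed")
        partition = self.dropped.pop(commit)
        if policy == "restore":
            # Option 1 from the issue: rollback also restores the listing.
            self.listed.add(partition)
        # policy == "current": data is readable again, listing is left stale.

    def show_partitions(self):
        return sorted(self.listed)

    def readable_partitions(self):
        hidden = set(self.dropped.values())
        return sorted(p for p in self.data if p not in hidden)


def demo(policy):
    """Drop a partition, roll the drop back, and report listing vs. readable data."""
    table = ToyTable()
    table.write("dt=2023-01-01", "f1.parquet")
    table.write("dt=2023-01-02", "f2.parquet")
    table.drop_partition("c3", "dt=2023-01-02")
    table.rollback("c3", policy=policy)
    return table.show_partitions(), table.readable_partitions()
```

With `policy="current"` the two returned lists differ, which mirrors the state described above; with `policy="restore"` they match, and with `policy="reject"` the rollback raises before the table can reach the inconsistent state.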
   
    
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5603
   - Type: Task


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
