Karen Coppage created HIVE-24444:
------------------------------------

             Summary: compactor.Cleaner should not set state "mark cleaned" if 
there are obsolete files in the FS
                 Key: HIVE-24444
                 URL: https://issues.apache.org/jira/browse/HIVE-24444
             Project: Hive
          Issue Type: Bug
            Reporter: Karen Coppage
            Assignee: Karen Coppage


This is an improvement on HIVE-24314, in which markCleaned() is called only if 
+any+ files are deleted by the cleaner. This could cause a problem in the 
following case:

Say that for table_1, cleaning for compaction1 was blocked by an open txn, and 
compaction is then run again on the same table (compaction2). Both compaction1 
and compaction2 could be in "ready for cleaning" at the same time. By this time 
the blocking open txn could have been committed. When the cleaner runs, one of 
compaction1 and compaction2 will remain in the "ready for cleaning" state:
Say compaction2 is picked up by the cleaner first. The cleaner deletes all 
obsolete files. Then compaction1 is picked up by the cleaner; the cleaner 
doesn't remove any files, so markCleaned() is not called and compaction1 stays 
in the queue in the "ready for cleaning" state.
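
The scenario above can be reproduced with a small toy model. This is only a 
sketch in Python, not Hive code: the Compaction class, the clean() and 
run_cleaner() helpers, the rule names and the file names are all hypothetical, 
and the "blocked" set stands in for files an open txn still needs. It compares 
the "any files deleted" rule (HIVE-24314 as described above) with the check 
proposed here, "no obsolete files left".

```python
from dataclasses import dataclass

READY_FOR_CLEANING = "ready for cleaning"
CLEANED = "cleaned"


@dataclass
class Compaction:
    """Hypothetical stand-in for one entry in the compaction queue."""
    name: str
    state: str = READY_FOR_CLEANING


def clean(compaction, obsolete_files, blocked, rule):
    """Delete every obsolete file that is not blocked, then decide whether
    the compaction may be marked cleaned under the given rule."""
    deletable = obsolete_files - blocked
    obsolete_files.difference_update(deletable)  # mutates the shared set

    if rule == "any_deleted":
        # Rule as described for HIVE-24314: mark only if something was deleted.
        done = len(deletable) > 0
    elif rule == "none_left":
        # Proposed rule: mark only if no obsolete files remain.
        done = len(obsolete_files) == 0
    else:
        raise ValueError(f"unknown rule: {rule}")

    if done:
        compaction.state = CLEANED


def run_cleaner(rule, blocked=frozenset()):
    """Run the cleaner over compaction2 first, then compaction1."""
    obsolete_files = {"delta_1_1", "delta_2_2"}  # hypothetical file names
    compaction1 = Compaction("compaction1")
    compaction2 = Compaction("compaction2")
    for compaction in (compaction2, compaction1):
        clean(compaction, obsolete_files, set(blocked), rule)
    return compaction1.state, compaction2.state


if __name__ == "__main__":
    print(run_cleaner("any_deleted"))  # compaction1 is left in the queue
    print(run_cleaner("none_left"))    # both entries are marked cleaned
```

In this model, a file that is still blocked keeps both entries in "ready for 
cleaning" under the "none_left" rule, so nothing is marked cleaned while 
obsolete files remain in the FS.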

HIVE-24291 already solves this issue, but if it isn't usable (for example, if 
HMS schema changes are out of the question), then HIVE-24314 + this change will 
fix the issue of the Cleaner not removing all obsolete files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
