[ https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264177#comment-14264177 ]
Jihong Liu commented on HIVE-8966:
----------------------------------

I ran a test. In general the new version works as expected, but in the following case compaction always fails:

1. For whatever reason, the writer exits without closing a batch, so the "length" file is left behind. This can happen, for example, when the program is killed or when Hive or the server restarts.
2. The program is restarted, so a new writer and a new batch are created and continue to write into the same partition. The data goes to a new delta.
3. We manually delete the "length" file in the previous delta and then run compaction, but it fails. Even if we exit the program completely, so that there is no open batch and no "length" file, compaction never succeeds for this partition.

However, the current Hive 0.14.0 works fine in the above case.

> Delta files created by hive hcatalog streaming cannot be compacted
> ------------------------------------------------------------------
>
>                 Key: HIVE-8966
>                 URL: https://issues.apache.org/jira/browse/HIVE-8966
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.14.0
>         Environment: hive
>            Reporter: Jihong Liu
>            Assignee: Alan Gates
>            Priority: Critical
>             Fix For: 0.14.1
>
>         Attachments: HIVE-8966.2.patch, HIVE-8966.3.patch, HIVE-8966.patch
>
>
> Hive HCatalog streaming also creates a file named bucket_n_flush_length in
> each delta directory, where "n" is the bucket number. However,
> compactor.CompactorMR treats this file as one that also needs to be
> compacted. The file cannot be compacted, so compactor.CompactorMR does not
> continue with the compaction.
> In a test, after the bucket_n_flush_length file was removed, "alter
> table partition compact" finished successfully. If the file is not deleted,
> nothing is compacted.
> This is probably a very severe bug. Both 0.13 and 0.14 have this issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
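
To illustrate the issue description above, here is a minimal Python sketch of separating bucket data files from bucket_n_flush_length side files when listing a delta directory. It is not the HIVE-8966 patch and not CompactorMR code. The function name split_delta_files is hypothetical, and the sketch assumes that bucket data files are named "bucket_" followed by digits only.

```python
import re

# Suffix of the side file named in the report (bucket_n_flush_length).
FLUSH_LENGTH_SUFFIX = "_flush_length"

# Assumption: bucket data files are "bucket_" followed by digits only.
BUCKET_FILE = re.compile(r"^bucket_(\d+)$")


def split_delta_files(names):
    """Partition the file names found in one delta directory.

    Returns a tuple (bucket_files, side_files, other_files). Only
    bucket_files would be handed to a compaction job; side_files are
    the flush-length files that cannot be compacted.
    """
    buckets, side_files, others = [], [], []
    for name in names:
        if name.endswith(FLUSH_LENGTH_SUFFIX):
            side_files.append(name)
        elif BUCKET_FILE.match(name):
            buckets.append(name)
        else:
            others.append(name)
    return buckets, side_files, others


if __name__ == "__main__":
    # Hypothetical listing of a single delta directory.
    listing = ["bucket_00000", "bucket_00000_flush_length", "bucket_00001"]
    buckets, side_files, _ = split_delta_files(listing)
    print("compact:", buckets)
    print("skip:", side_files)
```

The design point is only that the side file is recognized by its suffix and excluded from the compaction input, instead of being deleted by hand before running "alter table partition compact".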