[ https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666323#comment-15666323 ]
Rui Li commented on HIVE-15202:
-------------------------------

Prior to HIVE-13040, selecting from such a table fails with an NPE in split generation. With HIVE-13040, the select returns properly. But I'm not sure it 100% solves the problem, because this wasn't the original goal of HIVE-13040.
The root cause is in {{CompactorOutputCommitter::commitJob}}, where we call rename to move the output from the tmp location to the final location. However, if the final location already exists, i.e. it was already computed by another compaction task, the rename will merge the two outputs, resulting in the nested base dir we see.
A mitigation is to delete the existing final location before the rename. But I guess it won't 100% solve the race condition here.

> Concurrent compactions for the same partition may generate malformed folder structure
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-15202
>                 URL: https://issues.apache.org/jira/browse/HIVE-15202
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rui Li
>
> If two compactions run concurrently on a single partition, they may generate a folder structure like this (nested base dir):
> {noformat}
> drwxr-xr-x   - root supergroup          0 2016-11-14 22:23 /user/hive/warehouse/test/z=1/base_0000007/base_0000007
> -rw-r--r--   3 root supergroup        201 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00000
> -rw-r--r--   3 root supergroup        611 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00001
> -rw-r--r--   3 root supergroup        614 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00002
> -rw-r--r--   3 root supergroup        621 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00003
> -rw-r--r--   3 root supergroup        621 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00004
> -rw-r--r--   3 root supergroup        201 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00005
> -rw-r--r--   3 root supergroup        201 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00006
> -rw-r--r--   3 root supergroup        201 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00007
> -rw-r--r--   3 root supergroup        201 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00008
> -rw-r--r--   3 root supergroup        201 2016-11-14 21:46 /user/hive/warehouse/test/z=1/base_0000007/bucket_00009
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
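
For reference, the rename behaviour described in the comment can be reproduced on a local filesystem. This is only a sketch, not the Hive code: {{renameLikeHdfs}}, {{commit}}, {{writeTmpOutput}} and the {{tmp_a}}/{{tmp_b}}/{{tmp_c}} directories are hypothetical names, and it assumes the rename places the source under the destination when the destination is an existing directory, which matches the listing in the issue. The two commits are run one after the other, so it shows the resulting layout rather than the actual timing of the race.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RenameRaceSketch {

    // Hypothetical helper (not a Hadoop API): if dst is an existing directory,
    // move src underneath it; otherwise move src to dst itself.
    static Path renameLikeHdfs(Path src, Path dst) throws IOException {
        Path target = Files.isDirectory(dst)
                ? dst.resolve(src.getFileName().toString())
                : dst;
        return Files.move(src, target);
    }

    // Delete a directory tree, children before parents.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    // Stand-in for one compaction's output: <root>/<tmpName>/base_0000007/bucket_00000
    static Path writeTmpOutput(Path root, String tmpName) throws IOException {
        Path out = root.resolve(tmpName).resolve("base_0000007");
        Files.createDirectories(out);
        Files.write(out.resolve("bucket_00000"), new byte[] {1});
        return out;
    }

    // Stand-in for the commit step; deleteExisting enables the proposed mitigation.
    static void commit(Path tmpOutput, Path finalLocation, boolean deleteExisting)
            throws IOException {
        if (deleteExisting) {
            // Still check-then-act: another task can recreate finalLocation
            // between this delete and the rename below.
            deleteRecursively(finalLocation);
        }
        renameLikeHdfs(tmpOutput, finalLocation);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hive-15202");
        Path partition = Files.createDirectories(root.resolve("z=1"));
        Path finalLocation = partition.resolve("base_0000007");

        // Two compactions commit to the same final location without cleanup.
        commit(writeTmpOutput(root, "tmp_a"), finalLocation, false);
        commit(writeTmpOutput(root, "tmp_b"), finalLocation, false);
        System.out.println("nested base dir: "
                + Files.isDirectory(finalLocation.resolve("base_0000007")));

        // A third commit with the mitigation replaces the existing output.
        commit(writeTmpOutput(root, "tmp_c"), finalLocation, true);
        System.out.println("nested base dir after mitigation: "
                + Files.isDirectory(finalLocation.resolve("base_0000007")));
    }
}
```

The delete-then-rename in {{commit}} is two separate steps, which is why the mitigation narrows the window but probably does not close it: a second task can still create the final location in between.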