Peter Vary created HIVE-20183:
---------------------------------

    Summary: Inserting from bucketed table can cause data loss, if the source table contains empty buckets
        Key: HIVE-20183
        URL: https://issues.apache.org/jira/browse/HIVE-20183
    Project: Hive
 Issue Type: Bug
 Components: Operators
   Reporter: Peter Vary
   Assignee: Peter Vary

The issue can be reproduced with the following:
{code}
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;
set hive.optimize.bucketingsorting=true;

create table bucket1 (id int, val string) clustered by (id) sorted by (id ASC) INTO 4 BUCKETS;
insert into bucket1 values (1, 'abc'), (3, 'abc');
select * from bucket1;
+-------------+--------------+
| bucket1.id  | bucket1.val  |
+-------------+--------------+
| 3           | abc          |
| 1           | abc          |
+-------------+--------------+

create table bucket2 like bucket1;
insert overwrite table bucket2 select * from bucket1;
select * from bucket2;
+-------------+--------------+
| bucket2.id  | bucket2.val  |
+-------------+--------------+
| 1           | abc          |
+-------------+--------------+
{code}
After the insert overwrite, bucket2 contains only the row with id 1; the row with id 3 is lost.
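
The reproduction inserts two rows into a table with 4 buckets, so at least two buckets must be empty whatever the hash function is. The sketch below, in Python, shows one possible assignment. It assumes the legacy bucketing rule for int keys (the hash is the value itself, and the bucket is the non-negative hash modulo the bucket count); newer bucketing versions may hash differently, so the exact bucket numbers are an assumption. The helper names `bucket_for_int` and `bucket_occupancy` are hypothetical and are not part of Hive.

```python
from typing import Dict, List


def bucket_for_int(value: int, num_buckets: int) -> int:
    """Assumed legacy rule for INT keys: hash == value, masked to be non-negative."""
    return (value & 0x7FFFFFFF) % num_buckets


def bucket_occupancy(ids: List[int], num_buckets: int) -> Dict[int, List[int]]:
    """Group the given ids by the bucket they would land in."""
    buckets: Dict[int, List[int]] = {b: [] for b in range(num_buckets)}
    for row_id in ids:
        buckets[bucket_for_int(row_id, num_buckets)].append(row_id)
    return buckets


# Rows from the reproduction: ids 1 and 3, table declared with 4 buckets.
occupancy = bucket_occupancy([1, 3], 4)
empty_buckets = [b for b, rows in occupancy.items() if not rows]

print(occupancy)      # which ids land in which bucket under the assumed rule
print(empty_buckets)  # buckets that receive no rows
```

Under this assumption the two rows land in different buckets with empty buckets around them, which matches the condition named in the summary (a source table that contains empty buckets).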