[ https://issues.apache.org/jira/browse/HIVE-28120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ayush Saxena resolved HIVE-28120.
---------------------------------
    Fix Version/s: Not Applicable
       Resolution: Cannot Reproduce

> When insert overwrite the iceberg table, data will loss if the sql contains union all
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-28120
>                 URL: https://issues.apache.org/jira/browse/HIVE-28120
>             Project: Hive
>          Issue Type: Bug
>          Components: Iceberg integration
>    Affects Versions: 4.0.0-beta-1
>         Environment: hadoop version: 3.3.1
>                      hive version: 4.0.0-beta-1
>                      iceberg version: 1.3.0
>            Reporter: xinmingchang
>            Priority: Major
>             Fix For: Not Applicable
>
>
> {{(1)}}
> create table tmp.test_iceberg_overwrite_union_all(
>   a string
> )
> stored by iceberg
> ;
> {{(2)}}
> insert overwrite table tmp.test_iceberg_overwrite_union_all
> select distinct 'a' union all select distinct 'b';
> {{(3)}}
> select * from tmp.test_iceberg_overwrite_union_all;
>
> The result has only one record:
> +-------------------------------------+
> | test_iceberg_overwrite_union_all.a  |
> +-------------------------------------+
> | a                                   |
> +-------------------------------------+
> According to the HiveServer log, this query starts two jobs, and each job
> is committed separately. The problem is that the job committed later is
> also an overwrite, so it replaces the result of the first commit.
> like this:
> 2024-03-05T22:10:12,995 INFO [iceberg-commit-table-pool-0]: hive.HiveIcebergOutputCommitter () - Committing job has started for table: default_iceberg.tmp.test_iceberg_overwrite_union_all
> 2024-03-05T22:10:13,081 INFO [iceberg-commit-table-pool-1]: hive.HiveIcebergOutputCommitter () - Committing job has started for table: default_iceberg.tmp.test_iceberg_overwrite_union_all
> 2024-03-05T22:10:15,152 INFO [iceberg-commit-table-pool-0]: hive.HiveIcebergOutputCommitter () - Overwrite commit took 2157 ms for table: default_iceberg.tmp.test_iceberg_overwrite_union_all with 1 file(s)
> 2024-03-05T22:10:16,980 INFO [iceberg-commit-table-pool-1]: hive.HiveIcebergOutputCommitter () - Overwrite commit took 3899 ms for table: default_iceberg.tmp.test_iceberg_overwrite_union_all with 1 file(s)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
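
The reporter's hypothesis (two commits for one statement, both with overwrite semantics) can be illustrated with a small toy model. This is only a sketch of the described behaviour, not Hive or Iceberg code: `ToyTable`, `overwrite_commit`, `append_commit`, and `run_union_all` are hypothetical names, and the commit order of the two branches is an assumption chosen so that the surviving row matches the report. The issue itself was resolved as Cannot Reproduce, so the model says nothing about what Hive actually does.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy stand-in for a table; NOT the Iceberg or Hive API."""
    rows: list = field(default_factory=list)

    def overwrite_commit(self, new_rows):
        # Overwrite semantics: discard whatever the table held before.
        self.rows = list(new_rows)

    def append_commit(self, new_rows):
        # Append semantics: keep existing rows and add the new ones.
        self.rows = self.rows + list(new_rows)


def run_union_all(table, branch_outputs, every_commit_overwrites):
    """Commit each UNION ALL branch as its own job, in list order."""
    for index, rows in enumerate(branch_outputs):
        if index == 0 or every_commit_overwrites:
            table.overwrite_commit(rows)
        else:
            table.append_commit(rows)
    return table.rows


# Assumption: the branch producing 'b' commits first, 'a' commits last.
branches = [["b"], ["a"]]

# Behaviour described in the report: the later overwrite wins.
lost = run_union_all(ToyTable(), branches, every_commit_overwrites=True)

# What the statement is expected to produce: both rows survive.
kept = run_union_all(ToyTable(), branches, every_commit_overwrites=False)

print(lost)
print(kept)
```

In the first run only the last committed branch survives, which is the single-row result shown in the report; in the second run only the first commit replaces the table contents and both rows remain.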