[ https://issues.apache.org/jira/browse/HIVE-28598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
yongzhi.shao updated HIVE-28598:
--------------------------------
    Description: 
We have found that, in some scenarios, joining two Iceberg tables throws a NullPointerException (NPE).

INIT-SQL:
{code:sql}
CREATE TABLE T1
(
  ID  STRING,
  ID2 STRING,
  ID3 STRING
) STORED BY ICEBERG STORED AS ORC;

CREATE TABLE T2
(
  ID  STRING,
  ID2 STRING
) STORED BY ICEBERG STORED AS ORC;

CREATE TABLE T1_ORC
(
  ID  STRING,
  ID2 STRING,
  ID3 STRING
) STORED AS ORC;

CREATE TABLE T2_ORC
(
  ID  STRING,
  ID2 STRING
) STORED AS ORC;
{code}

1. When the bucket_version of table T1 differs from that of table T2, the following SQL throws an error:
{code:sql}
select count(1)
from
  (select ID, ID2, ID3 from test.t1) t
left join
  (select ID, ID2 from test.t2) t2
on t.ID = t2.ID and t.ID2 = t2.ID2;
{code}

2. When the bucket_version of T1 and T2 is the same, problem 1 disappears, but the following SQL still throws an exception:
{code:sql}
select count(1)
from
  (select ID, ID2 from test.t1 WHERE ID3 = 'NORMAL') t
left join
  (select ID, ID2 from test.t2) t2
on t.ID = t2.ID and t.ID2 = t2.ID2;
{code}

When I replace T1 and T2 with T1_ORC and T2_ORC, both queries execute without error.

  was: Currently, we have found that in some scenarios, join operations using two iceberg tables may result in NPEs.

> Join by using two iceberg table may cause NPE
> ---------------------------------------------
>
>                 Key: HIVE-28598
>                 URL: https://issues.apache.org/jira/browse/HIVE-28598
>             Project: Hive
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone)
>          Components: Iceberg integration, Query Processor
>    Affects Versions: 4.0.1
>            Reporter: yongzhi.shao
>            Priority: Major
>
> Currently, we have found that in some scenarios, join operations using two
> iceberg tables may result in NPEs.
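To tell which of the two cases above applies, it may help to compare the bucketing version of both tables before running the join. A minimal sketch, assuming that "bucket_version" in the description refers to Hive's {{bucketing_version}} table property and that the tables live in the {{test}} database as in the queries; the value 2 is only illustrative, and whether changing it is appropriate for an Iceberg table has not been verified here.

{code:sql}
-- Assumption: "bucket_version" in the report is the bucketing_version table property.
-- Compare the property on both sides of the join.
SHOW TBLPROPERTIES test.t1("bucketing_version");
SHOW TBLPROPERTIES test.t2("bucketing_version");

-- To move from case 1 to case 2, align the property on one table
-- (illustrative value; use whatever the other table reports).
ALTER TABLE test.t2 SET TBLPROPERTIES ('bucketing_version'='2');
{code}

If the two values differ, the first query is the relevant reproduction; if they match, the second query with the {{ID3 = 'NORMAL'}} filter is.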