[ https://issues.apache.org/jira/browse/HIVE-22067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aditya Shah updated HIVE-22067:
-------------------------------
    Description:
In case of an ACID table, the final paths array of the FileSink operator is populated using the bucket id as the index. This leaves null entries in the array when some of the buckets are not written to. As a result, committing the paths in closeOp throws an NPE.

Observed for the following queries:
{code:java}
CREATE TABLE if not exists test_bckt_part(a int) partitioned by (b int) stored as orc;
CREATE TABLE test_src_delete (a int, b int) CLUSTERED BY (b) into 5 BUCKETS;
INSERT INTO TABLE test_src_delete values (1,2),(3,4),(5,2),(7,8),(9,10),(11,2),(34,53),(95,23),(1,2),(3,4),(5,2),(7,8),(9,10),(11,2),(34,53),(95,23);
set tez.grouping.split-count=5;
INSERT OVERWRITE TABLE test_bckt_part SELECT * FROM test_src_delete;
Alter table test_bckt_part SET TBLPROPERTIES ('transactional'='true');
update test_bckt_part set a=99 where b=23;
{code}

> Null pointer exception for update query on a partitioned acid table
> -------------------------------------------------------------------
>
>                 Key: HIVE-22067
>                 URL: https://issues.apache.org/jira/browse/HIVE-22067
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>   Affects Versions: 3.1.1
>            Reporter: Aditya Shah
>            Priority: Major
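
The failure mode described above can be illustrated with a small standalone sketch. This is not Hive code and not the fix for this issue: the class and method names (BucketPathsSketch, buildFinalPaths, commitUnchecked, commitSkippingNulls) and the path strings are hypothetical, and the choice of bucket 3 out of 5 is an assumption made only for illustration. It shows how an array indexed by bucket id ends up with null slots when only some buckets are written, and how a commit loop that dereferences every slot then throws an NPE.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hive.
final class BucketPathsSketch {

    // Builds an array with one slot per bucket and fills only the slots
    // of the buckets that were written, using the bucket id as the index.
    static String[] buildFinalPaths(int numBuckets, int[] writtenBuckets) {
        String[] finalPaths = new String[numBuckets];
        for (int bucketId : writtenBuckets) {
            finalPaths[bucketId] = "bucket_" + bucketId;
        }
        return finalPaths;
    }

    // Commit loop that assumes every slot is populated. It throws a
    // NullPointerException at the first slot that was never written.
    static int commitUnchecked(String[] finalPaths) {
        int committed = 0;
        for (String path : finalPaths) {
            if (!path.isEmpty()) { // dereferences a null slot
                committed++;
            }
        }
        return committed;
    }

    // One possible guard (an assumption, not the actual patch):
    // skip the slots that were never written.
    static List<String> commitSkippingNulls(String[] finalPaths) {
        List<String> committed = new ArrayList<>();
        for (String path : finalPaths) {
            if (path != null) {
                committed.add(path);
            }
        }
        return committed;
    }

    public static void main(String[] args) {
        // Assumed scenario: 5 buckets, only bucket 3 is written.
        String[] finalPaths = buildFinalPaths(5, new int[] {3});
        try {
            commitUnchecked(finalPaths);
        } catch (NullPointerException e) {
            System.out.println("NPE while committing sparse paths");
        }
        System.out.println(commitSkippingNulls(finalPaths));
    }
}
```

Skipping null slots is only one way to avoid the exception; populating the array densely, independent of bucket id, would be another. Which approach suits the FileSink operator is left to the actual patch.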