[ https://issues.apache.org/jira/browse/HIVE-22318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719814#comment-17719814 ]
zengxl commented on HIVE-22318:
-------------------------------

I have the same problem.

{code:java}
MERGE INTO WH_OFR.ITV_ACTIVATE_DAY_BUCKET_TEST WDM
USING WH_OFR.ITV_ACTIVATE_DAY_TEMP1_BUCKET_TEST IDM
ON (WDM.PROD_ID = IDM.PROD_ID)
WHEN MATCHED THEN UPDATE SET
  PROD_ID = IDM.PROD_ID,
  PLATFORM_NAME = IDM.PLATFORM_NAME,
  ACCOUNT = IDM.ACCOUNT,
  ACTIVE_DATE = IDM.ACTIVE_DATE,
  FILE_CYCLE = '2023-05-03',
  FILE_NBR = 1
WHEN NOT MATCHED THEN INSERT VALUES(
  IDM.PROD_ID,
  IDM.PLATFORM_NAME,
  IDM.ACCOUNT,
  IDM.ACTIVE_DATE,
  '2023-05-03',
  1);

CREATE TABLE `WH_OFRST.ITV_ACTIVATE_DAY_BUCKET_TEST`(
  `prod_id` bigint,
  `platform_name` string,
  `account` string,
  `active_date` timestamp,
  `file_cycle` timestamp,
  `file_nbr` bigint)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'viewfs://xxx/UserData/wh_ofrst/itv_activate_day_bucket_test'
TBLPROPERTIES (
  'bucketing_version'='2',
  'transactional'='true',
  'transactional_properties'='default',
  'transient_lastDdlTime'='1679341838');

CREATE TABLE `WH_OFRST.ITV_ACTIVATE_DAY_TEMP1_BUCKET_TEST`(
  `prod_id` bigint,
  `platform_name` string,
  `account` string,
  `active_date` timestamp)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'viewfs://xxx/UserData/tmp/wh_ofrst.itv_activate_day_temp1_bucket_test'
TBLPROPERTIES (
  'bucketing_version'='2',
  'transactional'='true',
  'transactional_properties'='default',
  'transient_lastDdlTime'='1683236932')
{code}

Exception:

{code:java}
Error: java.io.IOException: java.io.IOException: Two readers for {originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}: new [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/wh_ofr/itv_activate_day/delete_delta_0000005_0000008/bucket_00001, 9223372036854775807)], old [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/wh_ofr/itv_activate_day/delete_delta_0000005_0000008/bucket_00000, 9223372036854775807)]
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:420)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:702)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:176)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:445)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:350)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.io.IOException: Two readers for {originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}: new [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/ffcs_edw/wh_ofr/itv_activate_day/delete_delta_0000005_0000008/bucket_00001, 9223372036854775807)], old [key={originalWriteId: 4, 536870912(1.0.0), row: 730862, currentWriteId 8}, nextRecord={2, 4, 536870912, 730862, 8, null}, reader=Hive ORC Reader(viewfs://xxx/UserData/ffcs_edw/wh_ofr/itv_activate_day/delete_delta_0000005_0000008/bucket_00000, 9223372036854775807)]
    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.ensurePutReader(OrcRawRecordMerger.java:1191)
    at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:1146)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:2110)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:2008)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:417)
{code}

> Java.io.exception: Two readers for
> ----------------------------------
>
>                 Key: HIVE-22318
>                 URL: https://issues.apache.org/jira/browse/HIVE-22318
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: max_c
>            Priority: Major
>         Attachments: hiveserver2 for exception.log
>
>
> I created an ACID table with ORC format:
>
> {noformat}
> CREATE TABLE `some.TableA`(
> ....
> )
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> TBLPROPERTIES (
>   'bucketing_version'='2',
>   'orc.compress'='SNAPPY',
>   'transactional'='true',
>   'transactional_properties'='default'){noformat}
> After executing a MERGE INTO operation:
> {noformat}
> MERGE INTO some.TableA AS a
> USING (SELECT vend_no FROM some.TableB
>        UNION ALL
>        SELECT vend_no FROM some.TableC) AS b
> ON a.vend_no=b.vend_no
> WHEN MATCHED THEN DELETE
> {noformat}
> the problem happens (selecting from TableA raises the same exception):
> {noformat}
> java.io.IOException: java.io.IOException: Two readers for {originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}: new [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_0000015_0000026/bucket_00001, 9223372036854775807)], old [key={originalWriteId: 4, bucket: 536870912(1.0.0), row: 2434, currentWriteId 25}, nextRecord={2, 4, 536870912, 2434, 25, null}, reader=Hive ORC Reader(hdfs://hdpprod/warehouse/tablespace/managed/hive/some.db/tableA/delete_delta_0000015_0000026/bucket_00000{noformat}
> Using orc_tools I scanned all the files (bucket_00000, bucket_00001, bucket_00002) under delete_delta and found that they all contain the same rows. I think this causes the same key (RecordIdentifier) to be seen again when bucket_00001 is scanned after bucket_00000, but I don't know why all the rows are identical across these bucket files.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
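The reporter's reasoning above can be illustrated outside Hive. OrcRawRecordMerger refuses to register two readers for the same ROW__ID key (originalWriteId, bucket, rowId), so if every delete_delta bucket file contains the same rows, the second file to be opened trips that check. The following is a minimal, simplified Python sketch of that bookkeeping; the names (`register_readers`, `TwoReadersError`) are hypothetical and this is not Hive's actual implementation:

```python
# Simplified illustration of the duplicate-key check behind
# "java.io.IOException: Two readers for {...}". Not Hive code.

class TwoReadersError(Exception):
    pass

def register_readers(bucket_files):
    """bucket_files maps a delete_delta file name to the list of
    (originalWriteId, bucket, rowId) keys found in that file."""
    readers = {}  # key -> file that already "owns" a reader for it
    for fname, keys in bucket_files.items():
        for key in keys:
            if key in readers:
                # Same ROW__ID present in two different bucket files:
                # the merger cannot tell which reader should serve it.
                raise TwoReadersError(
                    f"Two readers for {key}: new [{fname}], old [{readers[key]}]")
            readers[key] = fname
    return readers

# Healthy layout: each delete_delta bucket file holds distinct keys.
ok = register_readers({
    "bucket_00000": [(4, 536870912, 1), (4, 536870912, 2)],
    "bucket_00001": [(4, 536870913, 1)],
})

# Layout reported in this issue: every bucket file holds the same rows,
# so the same key appears in bucket_00000 and bucket_00001.
try:
    register_readers({
        "bucket_00000": [(4, 536870912, 730862)],
        "bucket_00001": [(4, 536870912, 730862)],
    })
except TwoReadersError as e:
    print(e)
```

This matches the observation above: the exception names the same key twice, once per bucket file, which is only possible when identical rows were written to more than one bucket of the delete delta.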