Light-Towers commented on issue #8427: URL: https://github.com/apache/seatunnel/issues/8427#issuecomment-2614270615
> Are you saying that if a field in the Hive table is an array and there are two records with the same array value, this bug will be triggered?

Yes, if the non-null records in a rowBatch are the same, for example:

```
['欧元'],null,['欧元']
```

Here are my testing steps:

1. SQL:

   ```sql
   CREATE TABLE `test.orc_array_2`(
     `finance_currency` array<string>
   )
   PARTITIONED BY (`dt` string)
   STORED AS ORC;

   insert into test.orc_array_2 partition (dt="2025-01-06") values (array('欧元')),(array('欧元'));
   ```

2. Copy the file `/user/hive/warehouse/test/orc_array_2/dt=2025-01-06/000000_0` into the project as `connector-file-base/src/test/resources/000000_0`.
3. In `OrcReadStrategyTest.testOrcRead()`, change "/test.orc" to "/000000_0". The error then occurs.
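To make the trigger condition concrete, here is a minimal Python sketch of a check for the pattern described above: every non-null array value in a batch is identical. The helper name `non_null_values_identical` is hypothetical and is not part of SeaTunnel or ORC. Requiring at least two non-null values is an assumption taken from the two-row example, not something confirmed against the reader code.

```python
from typing import Optional, Sequence


def non_null_values_identical(batch: Sequence[Optional[Sequence[str]]]) -> bool:
    """Return True if the batch holds at least two non-null arrays and all are equal.

    Hypothetical helper for illustration only; the "at least two" threshold
    is an assumption based on the example rows in this comment.
    """
    # Drop null rows and normalise each array to a list for comparison.
    non_null = [list(value) for value in batch if value is not None]
    if len(non_null) < 2:
        return False
    first = non_null[0]
    return all(value == first for value in non_null)


# The pattern from the comment: two equal arrays with a null in between.
suspect_batch = [["欧元"], None, ["欧元"]]
is_suspect = non_null_values_identical(suspect_batch)
```

A check like this could help when scanning sample data for batches likely to reproduce the error. It does not model how the ORC reader itself represents the batch.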