Light-Towers commented on issue #8427:
URL: https://github.com/apache/seatunnel/issues/8427#issuecomment-2614270615

   > Are you saying that if a field in the Hive table is an array and there are 
two records with the same array value, this bug will be triggered?
   
   Yes, if the non-null records in a rowBatch are the same, like:
   ```
   ['欧元'],null,['欧元']
   ```
   
   
   Here are my testing steps:
   
   1. Run this SQL:
   ```sql
   CREATE TABLE `test.orc_array_2`(
       `finance_currency` array<string>
   ) PARTITIONED BY (`dt` string) STORED AS ORC;
   
   insert into test.orc_array_2 partition (dt="2025-01-06") values (array('欧元')),(array('欧元'));
   ```
   2. Copy the file `/user/hive/warehouse/test/orc_array_2/dt=2025-01-06/000000_0` into the project at `connector-file-base/src/test/resources/000000_0`.
   3. In `OrcReadStrategyTest.testOrcRead()`, change "/test.orc" to "/000000_0". Running the test then reproduces the error.
   
   
   

