[ https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raghav Aggarwal updated HIVE-27662:
-----------------------------------
    Description: 
When reading a text table with vectorization on and hive.fetch.task.conversion set to none, delimiters are parsed incorrectly in nested complex types containing a map. For example, if a column's schema is map<string,struct<id:string,name:string>>, then a \u0004 character appears in the output.

Here is an example:

Sample q file:
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
    "Map_Key1",
    named_struct(
      'Id',
      'Id_Value1',
      'Name',
      'Name_Value1'
    ),
    "Map_Key2",
    named_struct(
      'Id',
      'Id_Value2',
      'Name',
      'Name_Value2'
    )
  ) as testmarks;

select * from table4;

set hive.vectorized.execution.enabled=false;
select * from table4;
{code}
Output of 1st select statement:
{code:java}
bob· {"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}
{code}
Output of 2nd select statement:
{code:java}
bob· {"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}
{code}

The MAP complex type does not handle the case where it contains a nested complex type such as STRUCT, ARRAY, or UNION.

*To reproduce this issue:*
*mvn test -Dtest=TestCliDriver -Pitests -Dqfile=`qfile_name` -pl itests/qtest -Dtest.output.overwrite*

  was:
When reading a text table with vectorization on and hive.fetch.task.conversion set to none, delimiters are parsed incorrectly in nested complex types containing a map. For example, if a column's schema is map<string,struct<id:string,name:string>>, then a \u0004 character appears in the output.

Here is an example:

Sample q file:
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
    "Map_Key1",
    named_struct(
      'Id',
      'Id_Value1',
      'Name',
      'Name_Value1'
    ),
    "Map_Key2",
    named_struct(
      'Id',
      'Id_Value2',
      'Name',
      'Name_Value2'
    )
  ) as testmarks;

select * from table4;

set hive.vectorized.execution.enabled=false;
select * from table4;
{code}
Output of 1st select statement:
{code:java}
bob· {"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}
{code}
Output of 2nd select statement:
{code:java}
bob· {"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}
{code}

The MAP complex type does not handle the case where it contains a nested complex type such as STRUCT, ARRAY, or UNION.


> Incorrect parsing of nested complex types containing map during vectorized text processing
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27662
>                 URL: https://issues.apache.org/jira/browse/HIVE-27662
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Raghav Aggarwal
>            Assignee: Raghav Aggarwal
>            Priority: Major
>
> When reading a text table with vectorization on and
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly
> in nested complex types containing a map. For example, if a column's schema
> is map<string,struct<id:string,name:string>>, then a \u0004 character
> appears in the output.
>
> Here is an example:
>
> Sample q file:
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
>
> create EXTERNAL table `table4` as
> select
>   'bob' as name,
>   map(
>     "Map_Key1",
>     named_struct(
>       'Id',
>       'Id_Value1',
>       'Name',
>       'Name_Value1'
>     ),
>     "Map_Key2",
>     named_struct(
>       'Id',
>       'Id_Value2',
>       'Name',
>       'Name_Value2'
>     )
>   ) as testmarks;
>
> select * from table4;
>
> set hive.vectorized.execution.enabled=false;
> select * from table4;
> {code}
> Output of 1st select statement:
> {code:java}
> bob· {"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}
> {code}
> Output of 2nd select statement:
> {code:java}
> bob· {"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}
> {code}
>
> The MAP complex type does not handle the case where it contains a nested
> complex type such as STRUCT, ARRAY, or UNION.
>
> *To reproduce this issue:*
> *mvn test -Dtest=TestCliDriver -Pitests -Dqfile=`qfile_name` -pl itests/qtest -Dtest.output.overwrite*

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
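
For readers less familiar with Hive's text layout, the two outputs in the report can be modelled with a short Python sketch. It assumes Hive's default nested separators for text tables (\u0001 between columns, \u0002 between map entries, \u0003 between a map key and its value, \u0004 between the fields of a struct held in a map value). It models the symptom as the struct inside the map value never being split on \u0004. The names parse_struct, parse_row and the split_struct flag are hypothetical. This is not Hive's vectorized reader code, and the actual root cause in Hive may differ.

```python
import json

# Assumed Hive default separators by nesting level (not taken from the report).
COLUMN_SEP = "\x01"     # between top-level columns
MAP_ENTRY_SEP = "\x02"  # between map entries
MAP_KV_SEP = "\x03"     # between a map key and its value
STRUCT_SEP = "\x04"     # between fields of a struct nested in a map value

STRUCT_FIELDS = ["id", "name"]


def parse_struct(text, split_struct=True):
    """Parse struct<id,name>; split_struct=False ignores the struct delimiter."""
    parts = text.split(STRUCT_SEP) if split_struct else [text]
    # Fields with no value become None, like the null seen in the report.
    parts += [None] * (len(STRUCT_FIELDS) - len(parts))
    return dict(zip(STRUCT_FIELDS, parts))


def parse_row(line, split_struct=True):
    """Parse one text row with columns (name string, testmarks map<string,struct>)."""
    name, raw_map = line.split(COLUMN_SEP)
    parsed = {}
    for entry in raw_map.split(MAP_ENTRY_SEP):
        key, value = entry.split(MAP_KV_SEP, 1)
        parsed[key] = parse_struct(value, split_struct)
    return name, parsed


# The row from the sample q file, encoded with the assumed separators.
ROW = COLUMN_SEP.join([
    "bob",
    MAP_ENTRY_SEP.join([
        MAP_KV_SEP.join(["Map_Key1", STRUCT_SEP.join(["Id_Value1", "Name_Value1"])]),
        MAP_KV_SEP.join(["Map_Key2", STRUCT_SEP.join(["Id_Value2", "Name_Value2"])]),
    ]),
])

if __name__ == "__main__":
    # False models the vectorized symptom, True models the expected result.
    for flag in (False, True):
        name, parsed = parse_row(ROW, split_struct=flag)
        print(name, json.dumps(parsed, separators=(",", ":")))
```

With split_struct=False, the whole struct payload lands in id and name stays null, which has the same shape as the output of the 1st select statement. With split_struct=True, the result has the same shape as the output of the 2nd.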