[ https://issues.apache.org/jira/browse/HIVE-13330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212643#comment-15212643 ]
Carl Steinbach commented on HIVE-13330:
---------------------------------------

Please change the name of the test from "vector_string_reader_empty_dict.q" to "orc_string_reader_empty_dict.q".

> ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13330
>                 URL: https://issues.apache.org/jira/browse/HIVE-13330
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>              Labels: CorrectnessBug
>         Attachments: HIVE-13330.1.patch, HIVE-13330.2.patch
>
>
> The vectorized string dictionary reader cannot differentiate between the case
> where all dictionary entries are null and the case of a single entry holding
> an empty string. This causes wrong results when reading data out of such files.
> {code:title=Vectorization On}
> SET hive.vectorized.execution.enabled=true;
> SET hive.fetch.task.conversion=none;
> select vcol from testnullorc3 limit 1;
> OK
> NULL
> {code}
> {code:title=Vectorization Off}
> SET hive.vectorized.execution.enabled=false;
> SET hive.fetch.task.conversion=none;
> select vcol from testnullorc3 limit 1;
> OK
> {code}
> The input table testnullorc3 contains a varchar column vcol with a few empty
> strings and a few nulls. For this table, the non-vectorized reader returns an
> empty string as the first row, but the vectorized reader returns NULL.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
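
The ambiguity described in the issue can be illustrated with a toy model. This is a minimal sketch, not the ORC reader code or the attached patch: the class `DictStringColumn` and the functions `read_buggy` and `read_correct` are hypothetical names, and the mechanism shown (inferring "all null" from a zero-length dictionary blob) is an assumption made for illustration only. It shows that a dictionary holding one empty string and a dictionary holding no entries both serialize to zero bytes, so nullness has to come from per-row presence information instead.

```python
from typing import List, Optional


class DictStringColumn:
    """Toy dictionary-encoded string column (hypothetical; not the ORC format)."""

    def __init__(self, values: List[Optional[str]]) -> None:
        # One presence flag per row: False marks a null row.
        self.present = [v is not None for v in values]
        # Distinct non-null values form the dictionary.
        self.dictionary = sorted({v for v in values if v is not None})
        index = {s: i for i, s in enumerate(self.dictionary)}
        # Dictionary ids are stored for non-null rows only.
        self.ids = [index[v] for v in values if v is not None]

    def dictionary_bytes(self) -> bytes:
        # Concatenated dictionary entries, as a single blob.
        return "".join(self.dictionary).encode("utf-8")


def read_correct(col: DictStringColumn) -> List[Optional[str]]:
    """Decide nullness from the presence flags, never from the blob size."""
    ids = iter(col.ids)
    return [col.dictionary[next(ids)] if p else None for p in col.present]


def read_buggy(col: DictStringColumn) -> List[Optional[str]]:
    """Assumed failure mode: an empty blob is taken to mean 'all rows null'."""
    if len(col.dictionary_bytes()) == 0:
        return [None] * len(col.present)
    return read_correct(col)


if __name__ == "__main__":
    col = DictStringColumn(["", None, "", None])
    print(read_correct(col))
    print(read_buggy(col))
```

In this model, a column of empty strings and nulls has a one-entry dictionary whose blob is zero bytes long, which is byte-for-byte the same blob as an all-null column with no dictionary entries. Only the entry count and the presence flags tell the two apart.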