[ https://issues.apache.org/jira/browse/HIVE-26320?focusedWorklogId=813740&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-813740 ]
ASF GitHub Bot logged work on HIVE-26320:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Sep/22 15:31
            Start Date: 30/Sep/22 15:31
    Worklog Time Spent: 10m
      Work Description: jfsii commented on code in PR #3628:
URL: https://github.com/apache/hive/pull/3628#discussion_r984714606


##########
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java:
##########

@@ -481,12 +485,36 @@ protected BytesWritable convert(Binary binary) {
     },
     ESTRING_CONVERTER(String.class) {
       @Override
-      PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent, TypeInfo hiveTypeInfo) {
+      PrimitiveConverter getConverter(final PrimitiveType type, final int index, final ConverterParent parent,
+          TypeInfo hiveTypeInfo) {
+        // If we have type information, we should return properly typed strings. However, there are a
+        // variety of code paths that do not provide the typeInfo; in those cases we default to Text.
+        // This idiom is also followed by, for example, the BigDecimal converter, which defaults to the
+        // widest representation when there is no type information.
+        if (hiveTypeInfo != null) {
+          String typeName = hiveTypeInfo.getTypeName().toLowerCase();
+          if (typeName.startsWith(serdeConstants.CHAR_TYPE_NAME)) {
+            return new BinaryConverter<HiveCharWritable>(type, parent, index) {
+              @Override
+              protected HiveCharWritable convert(Binary binary) {
+                return new HiveCharWritable(binary.getBytes(), ((CharTypeInfo) hiveTypeInfo).getLength());
+              }
+            };
+          } else if (typeName.startsWith(serdeConstants.VARCHAR_TYPE_NAME)) {
+            return new BinaryConverter<HiveVarcharWritable>(type, parent, index) {
+              @Override
+              protected HiveVarcharWritable convert(Binary binary) {
+                return new HiveVarcharWritable(binary.getBytes(), ((VarcharTypeInfo) hiveTypeInfo).getLength());
+              }
+            };
+          }
+        }

Review Comment:
   Yeah, I did follow the file's existing convention. However, I changed it to your suggestion because it looks cleaner, and it may as well encourage the cleaner style in case someone copies this code later.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 813740)
    Time Spent:     3.5h  (was: 3h 20m)

> Incorrect case evaluation for Parquet based table
> -------------------------------------------------
>
>                 Key: HIVE-26320
>                 URL: https://issues.apache.org/jira/browse/HIVE-26320
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Query Planning
>    Affects Versions: 4.0.0-alpha-1
>            Reporter: Chiran Ravani
>            Assignee: John Sherman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> A query involving a case statement with two or more conditions produces
> incorrect results for tables stored as Parquet. The problem is not observed
> with ORC or TextFile.
> *Steps to reproduce*:
> {code:java}
> create external table case_test_parquet(kob varchar(2), enhanced_type_code int) stored as parquet;
> insert into case_test_parquet values('BB',18),('BC',18),('AB',18);
> select case when (
>     (kob='BB' and enhanced_type_code='18')
>     or (kob='BC' and enhanced_type_code='18')
>   )
>   then 1
>   else 0
> end as logic_check
> from case_test_parquet;
> {code}
> Result:
> {code}
> 0
> 0
> 0
> {code}
> Expected result:
> {code}
> 1
> 1
> 0
> {code}
> The problem does not appear when setting hive.optimize.point.lookup=false.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
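For context, a minimal sketch of why the typed converters in the diff above matter. The VarcharVsTextDemo class below is hypothetical (not part of the patch), assumes hive-serde and hadoop-common are on the classpath, and shows one plausible way to picture the mismatch: the repro's kob column is varchar(2), so the rewritten IN-list constants are varchar-typed, while the old converter handed scanned values back as Text.

{code:java}
import org.apache.hadoop.hive.serde2.io.HiveVarcharWritable;
import org.apache.hadoop.io.Text;

// Hypothetical demo, not part of the patch: if the Parquet reader returns a
// varchar(2) value as Text while the point-lookup rewrite builds its IN-list
// constants as HiveVarcharWritable, the two Writables never compare equal even
// for identical characters, so the case expression takes the else branch for
// every row.
public class VarcharVsTextDemo {
  public static void main(String[] args) {
    Text fromOldConverter = new Text("BB"); // what ESTRING_CONVERTER used to produce
    HiveVarcharWritable fromTypedConverter = new HiveVarcharWritable();
    fromTypedConverter.set("BB", 2); // what the patched converter produces for varchar(2)

    // Different Writable classes: equals() is false even though both hold "BB".
    System.out.println(fromOldConverter.equals(fromTypedConverter)); // false
    // Viewed as plain strings, the character data is identical.
    System.out.println(fromTypedConverter.getHiveVarchar().getValue()
        .equals(fromOldConverter.toString())); // true
  }
}
{code}

This picture is also consistent with the workaround noted in the issue: hive.optimize.point.lookup=false disables the rewrite of the OR chain into an IN lookup, so the typed constant set that the Text values fail to match is never built.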