[ https://issues.apache.org/jira/browse/HIVE-24817?focusedWorklogId=562704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-562704 ]
ASF GitHub Bot logged work on HIVE-24817:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Mar/21 23:21
            Start Date: 08/Mar/21 23:21
    Worklog Time Spent: 10m
      Work Description:
scarlin-cloudera commented on a change in pull request #2027:
URL: https://github.com/apache/hive/pull/2027#discussion_r589822395

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java
##########
@@ -1007,17 +1001,12 @@ protected T getXpathOrFuncExprNodeDesc(ASTNode node,
           T columnDesc = children.get(0);
           T valueDesc = interpretNode(columnDesc, children.get(i));
           if (valueDesc == null) {
-            if (hasNullValue) {
-              // Skip if null value has already been added
-              continue;
-            }
-            TypeInfo targetType = exprFactory.getTypeInfo(columnDesc);
+            // Keep original
+            TypeInfo targetType = exprFactory.getTypeInfo(children.get(i));
             if (!expressions.containsKey(targetType)) {
               expressions.put(targetType, columnDesc);
             }
-            T nullConst = exprFactory.createConstantExpr(targetType, null);
-            expressions.put(targetType, nullConst);
-            hasNullValue = true;
+            expressions.put(targetType, children.get(i));
           } else {

Review comment:
So I'm not sure how to address your comments one by one, so I'm going to address all of them here.

Let me take a step back and tell you what I've already discussed with Jesus and the direction we were going.

We know we're losing some optimizations, as you've noted. Jesus felt that they weren't that big of a deal. For instance, we'd lose the optimization of "tinyint_col in (2500000)". Previously, this was rewritten to false, since a tinyint column can never hold that value, but that check won't be optimized out now. I think that's the main one I saw in the tests.

So for now, we'd like to avoid a bigger rewrite. He also noted that this optimization should perhaps live in the Calcite framework, which makes sense to me.

You did have one other comment about an "if" statement not being hit. I'm not sure I understand the bug that you're referring to. The code is a bit complicated to understand, but it seems ok to me? Can you explain this further?

Thanks again!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 562704)
    Time Spent: 40m  (was: 0.5h)

> "not in" clause returns incorrect data when there is coercion
> -------------------------------------------------------------
>
>                 Key: HIVE-24817
>                 URL: https://issues.apache.org/jira/browse/HIVE-24817
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>            Reporter: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> When the query has a where clause that checks an integer column with
> "not in" against a decimal value, the decimal value is being changed to null,
> causing incorrect results.
> This is a sample query of a failure:
> select count(*) from my_tbl where int_col not in (355.8);
> Since the int_col can never be 355.8, one would expect all the rows to be
> returned, but it is changing the 355.8 into a null value causing no rows to
> be returned.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
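
Note on the issue description above: replacing 355.8 with a null empties the result because of SQL's three-valued logic, where "x not in (null)" evaluates to UNKNOWN rather than TRUE, and a where clause keeps only TRUE rows. The following is a minimal standalone sketch of that behavior, not Hive code. The class name NotInSketch and the methods notIn and countWhere are hypothetical names chosen for illustration, and a Java null Boolean stands in for UNKNOWN.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of SQL NOT IN semantics; not part of Hive.
public class NotInSketch {

    // Returns TRUE, FALSE, or null (standing in for SQL UNKNOWN).
    public static Boolean notIn(BigDecimal value, List<BigDecimal> candidates) {
        if (value == null) {
            return null; // NULL NOT IN (...) is UNKNOWN
        }
        boolean sawNull = false;
        for (BigDecimal c : candidates) {
            if (c == null) {
                sawNull = true; // comparison with NULL is UNKNOWN
            } else if (value.compareTo(c) == 0) {
                return Boolean.FALSE; // a definite match makes NOT IN false
            }
        }
        // No match: UNKNOWN if any candidate was NULL, otherwise TRUE.
        return sawNull ? null : Boolean.TRUE;
    }

    // Mimics "select count(*) ... where int_col not in (...)":
    // only rows whose predicate is TRUE are counted.
    public static long countWhere(List<Integer> intCol, List<BigDecimal> candidates) {
        long n = 0;
        for (Integer v : intCol) {
            BigDecimal widened = (v == null) ? null : BigDecimal.valueOf(v.longValue());
            if (Boolean.TRUE.equals(notIn(widened, candidates))) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        List<Integer> intCol = Arrays.asList(1, 355, 356);

        // Literal kept as 355.8: no int can equal it, so every row passes.
        System.out.println(countWhere(intCol,
                Collections.singletonList(new BigDecimal("355.8")))); // prints 3

        // Literal replaced by null: every row evaluates to UNKNOWN.
        System.out.println(countWhere(intCol,
                Collections.singletonList((BigDecimal) null))); // prints 0
    }
}
```

The first call corresponds to keeping the original literal, as the patch does; the second corresponds to the null substitution the issue reports.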