Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/22408 )
Change subject: IMPALA-13520: Support in clause coercing ...................................................................... Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/22408/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/22408/7//COMMIT_MSG@11 PS7, Line 11: we do have nit: By 'we' I suppose you mean the Impala planner ? It wasn't clear from the description. http://gerrit.cloudera.org:8080/#/c/22408/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/CombineValueNodesRule.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/CombineValueNodesRule.java: http://gerrit.cloudera.org:8080/#/c/22408/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/CombineValueNodesRule.java@41 PS7, Line 41: * This is needed because there is a test case in test_exprs.py which creates : * an IN expression with 10000 literals in the IN clause. Calcite will create : * 10000 RelNodes for this which becomes incredibly slow at execution time. : * I think it is better to keep the comment general enough and not point to a specific test file name. The location of such tests could change in the future and the comment will be out of sync. Also, this could very well be a real user scenario, not just a test case that is being targeted. http://gerrit.cloudera.org:8080/#/c/22408/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/rules/CombineValueNodesRule.java@48 PS7, Line 48: CombineValueNodesRule nit: can we name this CombineValuesNodesRule (with the plural 'Values' since the actual naming is LogicalValues. ) This is a minor point. Up to you. http://gerrit.cloudera.org:8080/#/c/22408/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeCoercionImpl.java File java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeCoercionImpl.java: http://gerrit.cloudera.org:8080/#/c/22408/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeCoercionImpl.java@80 PS7, Line 80: // Only handle IN clause with a list as its second param Is a NOT IN list handled differently ? For the same example where we have a large NOT IN list of constants where the types could be heterogenous, is that transformed in a different way ? http://gerrit.cloudera.org:8080/#/c/22408/7/java/calcite-planner/src/main/java/org/apache/impala/calcite/type/ImpalaTypeCoercionImpl.java@93 PS7, Line 93: for (SqlNode node : inList) { : rightOperandTypes.add(deriveType(validator, scope, node)); : } : I think we only need to store the unique operand types into the rightOperandTypes list. So, for the original test case which has 10K items, suppose most are Integers and one is a Decimal, we only need to keep track of 2 types in the rightOperandTypes. One option is to use a Set. -- To view, visit http://gerrit.cloudera.org:8080/22408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I492845d623766b9182bca5eeca22eb3352ef2f3d Gerrit-Change-Number: 22408 Gerrit-PatchSet: 7 Gerrit-Owner: Steve Carlin <scar...@cloudera.com> Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com> Gerrit-Reviewer: Fang-Yu Rao <fangyu....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Steve Carlin <scar...@cloudera.com> Gerrit-Comment-Date: Thu, 30 Jan 2025 01:22:48 +0000 Gerrit-HasComments: Yes