[ https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=719958&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-719958 ]
ASF GitHub Bot logged work on HIVE-25758: ----------------------------------------- Author: ASF GitHub Bot Created on: 03/Feb/22 09:03 Start Date: 03/Feb/22 09:03 Worklog Time Spent: 10m Work Description: asolimando commented on a change in pull request #2966: URL: https://github.com/apache/hive/pull/2966#discussion_r798348894 ########## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveInBetweenExpandRule.java ########## @@ -138,7 +138,7 @@ public void onMatch(RelOptRuleCall call) { private final RexBuilder rexBuilder; private boolean modified; - private RexInBetweenExpander(RexBuilder rexBuilder) { + RexInBetweenExpander(RexBuilder rexBuilder) { Review comment: TLDR: Sorry, that's just a leftover from a previous attempt. Long story: At some point I attempted to expand `IN` to check the impact on simplification, but there were more side-effects than benefits, I have dropped that part but forgot to put the original visibility back. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 719958) Time Spent: 0.5h (was: 20m) > OOM due to recursive application of CBO rules > --------------------------------------------- > > Key: HIVE-25758 > URL: https://issues.apache.org/jira/browse/HIVE-25758 > Project: Hive > Issue Type: Bug > Components: CBO, Query Planning > Affects Versions: 4.0.0 > Reporter: Alessandro Solimando > Assignee: Alessandro Solimando > Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > > Reproducing query is as follows: > {code:java} > create table test1 (act_nbr string); > create table test2 (month int); > create table test3 (mth int, con_usd double); > EXPLAIN > SELECT c.month, > d.con_usd > FROM > (SELECT > cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), > 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month > FROM test1 > UNION ALL > SELECT month > FROM test2 > WHERE month = 202110) c > JOIN test3 d ON c.month = d.mth; {code} > > Different plans are generated during the first CBO steps, last being: > {noformat} > 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] > parse.CalcitePlanner: Plan after decorre > lation: > HiveProject(month=[$0], con_usd=[$2]) > HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], > cost=[not available]) > HiveProject(month=[$0]) > HiveUnion(all=[true]) > > HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, > _UTF-16LE'yyyy-MM-d > d':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), > _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF- > 16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER]) > HiveTableScan(table=[[default, test1]], table:alias=[test1]) > HiveProject(month=[$0]) > HiveFilter(condition=[=($0, CAST(202110):INTEGER)]) > HiveTableScan(table=[[default, test2]], table:alias=[test2]) > HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat} > > Then, the HEP planner will keep expanding the filter expression with > redundant expressions, such as the following, where the identical CAST > expression is present multiple times: > > {noformat} > rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, > _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, > 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", > _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, > CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, > _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, > 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", > _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, > 202110)){noformat} > > The problem seems to come from a bad interaction of at least > _HiveFilterProjectTransposeRule_ and > {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more. > Most probably then UNION part can be removed and the reproducer be simplified > even further. > -- This message was sent by Atlassian Jira (v8.20.1#820001)