[ https://issues.apache.org/jira/browse/HIVE-27858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817252#comment-17817252 ]
Ryu Kobayashi commented on HIVE-27858: -------------------------------------- [~okumin] and I investigated and found that the root cause was combinatorial explosion of execution plans. [^query.sql] has 20 nested CTEs, and after expansion, 40,000 Operators, including 4,000 TableScanOperators, are generated. The final memory consumption is also huge, since each has a copied String, HashMap, and ColumnInfo. The same issue happens even if it reduce the number of columns. heap dump: !image-2024-02-14-17-30-59-359.png|width=482,height=213! Operators: !image-2024-02-14-17-31-32-714.png|width=488,height=21! We believe that CTE materialization is currently the only effective way. This query will succeed if we set `hive.optimize.cte.materialize.threshold` and `hive.optimize.cte.materialize.full.aggregate.only=false`. However, due to a known bug(HIVE-24167) in CTE materialization, `hive.cbo.fallback.strategy=NEVER` will fail. Therefore, we need to disable CBO or set `hive.cbo.fallback.strategy=CONSERVATIVE`. And also, re-investigate, we found that this issue also happens in Hive 2 when `hive.optimize.cte.materialize.threshold=-1`. So, it is considered not to be degraded. In summary, this query will succeed even in Hive 4 if I set the following parameters: Therefore, I think that this ticket itself can be closed without fix. What do you think? {code:java} set hive.optimize.cte.materialize.threshold=3; set hive.optimize.cte.materialize.full.aggregate.only=false; set hive.cbo.fallback.strategy=CONSERVATIVE ;{code} > OOM happens when selecting many columns and JOIN. > -------------------------------------------------- > > Key: HIVE-27858 > URL: https://issues.apache.org/jira/browse/HIVE-27858 > Project: Hive > Issue Type: Bug > Components: Query Planning > Affects Versions: 4.0.0-beta-1 > Reporter: Ryu Kobayashi > Assignee: okumin > Priority: Critical > Labels: hive-4.0.0-must > Fix For: Not Applicable > > Attachments: HIVE-27858.full.q, HIVE-27858.less.columns.q, ddl.sql, > query.sql > > > OOM happens when executing [^query.sql] using a table in [^ddl.sql]. These > did not happen in Hive 2 previously. -- This message was sent by Atlassian Jira (v8.20.10#820010)