[ https://issues.apache.org/jira/browse/HIVE-24031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176322#comment-17176322 ]
Stamatis Zampetakis commented on HIVE-24031:
--------------------------------------------

I ran the query from above with {{TestMiniLlapLocalCliDriver}} and the profiling ([^query_big_array_constructor.nps]) shows that the vast majority of the time is spent creating defensive copies of the node expression list inside ASTNode#getChildren.

!ASTNode_getChildren_cost.png!

The method is called extensively from various places in the code, especially those walking over the expression tree, so it needs to be efficient.

I propose to drop the defensive copy (possibly protecting the list from modifications via an unmodifiable collection) and let clients make copies of the list if they deem it necessary. In most cases, if not all, copying the list seems unnecessary.

> Infinite planning time on syntactically big queries
> ---------------------------------------------------
>
>                 Key: HIVE-24031
>                 URL: https://issues.apache.org/jira/browse/HIVE-24031
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: ASTNode_getChildren_cost.png, query_big_array_constructor.nps
>
>
> Syntactically big queries (~1 million tokens), such as the query shown below,
> lead to very big (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
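
The proposal in the comment above (returning a read-only view of the child list instead of a defensive copy) can be illustrated with a minimal sketch. The class {{SketchNode}} and its methods are hypothetical names introduced only for this example; this is not Hive's actual ASTNode code, and the real fix may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ChildrenAccessSketch {

    // Hypothetical stand-in for a tree node; NOT Hive's ASTNode.
    static final class SketchNode {
        private final List<SketchNode> children = new ArrayList<>();
        // The wrapper is created once and reflects later changes to `children`.
        private final List<SketchNode> readOnlyChildren =
                Collections.unmodifiableList(children);

        void addChild(SketchNode child) {
            children.add(child);
        }

        // Defensive copy: allocates and copies all n references on every call.
        List<SketchNode> getChildrenCopy() {
            return new ArrayList<>(children);
        }

        // Read-only view: constant time, no per-call allocation.
        // Callers that need a mutable list make their own copy.
        List<SketchNode> getChildrenView() {
            return readOnlyChildren;
        }
    }

    public static void main(String[] args) {
        SketchNode root = new SketchNode();
        for (int i = 0; i < 3; i++) {
            root.addChild(new SketchNode());
        }

        // Each call to the copying accessor yields a distinct list object.
        System.out.println(root.getChildrenCopy() != root.getChildrenCopy());

        // The view accessor always returns the same wrapper object.
        System.out.println(root.getChildrenView() == root.getChildrenView());

        // Attempts to modify the view are rejected.
        try {
            root.getChildrenView().add(new SketchNode());
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

With the copying accessor, a tree walker that calls the accessor once per visit of a node with n children pays an O(n) copy each time, which is the kind of cost the profile points at for a query with about a million array elements. The view keeps the call constant-time while still preventing callers from modifying the node's internal list.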