[ 
https://issues.apache.org/jira/browse/HIVE-24031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176322#comment-17176322
 ] 

Stamatis Zampetakis commented on HIVE-24031:
--------------------------------------------

I ran the query above with {{TestMiniLlapLocalCliDriver}}, and the 
profiling ([^query_big_array_constructor.nps]) shows that the vast majority of 
time is spent creating defensive copies of the node expression list inside 
ASTNode#getChildren. 

 !ASTNode_getChildren_cost.png! 

The method is called extensively from various places in the code, especially 
those walking over the expression tree, so it needs to be efficient. I propose 
to drop the defensive copy (possibly protecting the list from modifications via 
an unmodifiable collection) and let clients make copies of the list if they deem 
it necessary. In most cases, if not all, copying the list seems 
useless.
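
As a rough illustration of the trade-off, here is a minimal sketch. The Node 
class and its method names are hypothetical stand-ins, not Hive's actual 
ASTNode code; the only library call assumed is the standard 
java.util.Collections#unmodifiableList.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a tree node; NOT Hive's actual ASTNode.
class Node {
    private final List<Node> children = new ArrayList<>();

    void addChild(Node child) {
        children.add(child);
    }

    // Defensive copy: allocates a new list and copies every child
    // reference on each call, so the cost grows with the child count.
    List<Node> getChildrenCopy() {
        return new ArrayList<>(children);
    }

    // Read-only view: wraps the backing list without copying elements.
    // Mutating calls on the view throw UnsupportedOperationException.
    List<Node> getChildrenView() {
        return Collections.unmodifiableList(children);
    }
}

class ChildrenAccessDemo {
    public static void main(String[] args) {
        Node root = new Node();
        for (int i = 0; i < 3; i++) {
            root.addChild(new Node());
        }

        // A caller that really needs a private, mutable list copies explicitly.
        List<Node> mine = new ArrayList<>(root.getChildrenView());
        mine.clear();

        System.out.println(root.getChildrenView().size());
    }
}
```

With the view variant, a walker that only reads the children pays no per-call 
copying cost, and the few callers that need to modify the list make the copy 
themselves.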

> Infinite planning time on syntactically big queries
> ---------------------------------------------------
>
>                 Key: HIVE-24031
>                 URL: https://issues.apache.org/jira/browse/HIVE-24031
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: ASTNode_getChildren_cost.png, 
> query_big_array_constructor.nps
>
>
> Syntactically big queries (~1 million tokens), such as the query shown below, 
> lead to very big (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
