[ https://issues.apache.org/jira/browse/HIVE-11652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715528#comment-14715528 ]
Hari Sankar Sivarama Subramaniyan commented on HIVE-11652:
----------------------------------------------------------

[~jcamachorodriguez] You may want to look at the latest patch of HIVE-11341, where I am looking into a similar issue.

{code}
// While all its descendants have not been dispatched,
// we do not move forward
while (!childrenDispatched) {
  for (Node childNode : nd.getChildren()) {
    walk(childNode);
  }
  childrenDispatched = getDispatchedList().containsAll(nd.getChildren());
}
{code}

In the above change, I see that you are using the system stack instead of the toWalk list to traverse the child nodes. However, a couple of points:

1. Why do you need the 'while(!childrenDispatched)' condition at all? Won't the code below work?
{code}
if (!getDispatchedList().contains(childNode)) {
  walk(childNode);
}
{code}

2. We should make sure that walk() is not called again for a node that has already been dispatched, and that the same check is present in startWalking as well:
{code}
public void startWalking(Collection<Node> startNodes,
    HashMap<Node, Object> nodeOutput) throws SemanticException {
  for (Node nd : startNodes) {
    // If already in the dispatched list, continue.
    if (getDispatchedList().contains(nd)) {
      continue;
    }
    walk(nd);
    if (nodeOutput != null && getDispatchedList().contains(nd)) {
      nodeOutput.put(nd, retMap.get(nd));
    }
  }
}
{code}

In short, if you are using the system stack via recursion, you can eliminate the toWalk data structure altogether for DefaultGraphWalker. Otherwise, you can change toWalk to a stack instead of an ArrayList.
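To make the recursion-only option concrete, here is a minimal, self-contained sketch. It is not Hive code: SimpleNode, Walker and the order list are hypothetical stand-ins for Node, DefaultGraphWalker and the real dispatch step, and it assumes the plan is a DAG (no cycles). It only illustrates that a dispatched set plus recursion visits every node once, children first, without a toWalk list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins; NOT Hive's Node or DefaultGraphWalker.
class RecursiveWalkSketch {

  static class SimpleNode {
    final String name;
    final List<SimpleNode> children = new ArrayList<>();

    SimpleNode(String name) {
      this.name = name;
    }

    SimpleNode add(SimpleNode child) {
      children.add(child);
      return this;
    }
  }

  static class Walker {
    // A hash set keeps the "already dispatched?" check cheap.
    private final Set<SimpleNode> dispatched = new HashSet<>();
    // Records dispatch order so the behaviour can be inspected.
    final List<String> order = new ArrayList<>();

    void startWalking(Collection<SimpleNode> startNodes) {
      for (SimpleNode nd : startNodes) {
        // Skip start nodes that an earlier walk already dispatched.
        if (dispatched.contains(nd)) {
          continue;
        }
        walk(nd);
      }
    }

    private void walk(SimpleNode nd) {
      // Recurse into children first; the call stack replaces toWalk.
      for (SimpleNode child : nd.children) {
        if (!dispatched.contains(child)) {
          walk(child);
        }
      }
      // All children are dispatched at this point (assuming no cycles).
      dispatched.add(nd);
      order.add(nd.name);
    }
  }

  public static void main(String[] args) {
    // Diamond-shaped graph: a -> b, a -> c, b -> d, c -> d.
    SimpleNode d = new SimpleNode("d");
    SimpleNode b = new SimpleNode("b").add(d);
    SimpleNode c = new SimpleNode("c").add(d);
    SimpleNode a = new SimpleNode("a").add(b).add(c);

    Walker walker = new Walker();
    walker.startWalking(Arrays.asList(a, d));
    System.out.println(walker.order); // prints [d, b, c, a]
  }
}
```

The shared child d is dispatched once even though it is reachable through both b and c and is also listed as a start node. One caveat with this approach: recursion depth grows with the depth of the plan, so a very deep plan could overflow the Java call stack, which is where the explicit-stack alternative may be preferable.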
Thanks
Hari

> Avoid expensive call to removeAll in DefaultGraphWalker
> -------------------------------------------------------
>
>                 Key: HIVE-11652
>                 URL: https://issues.apache.org/jira/browse/HIVE-11652
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer, Physical Optimizer
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-11652.patch
>
>
> When the plan is too large, the removeAll call in DefaultGraphWalker (line
> 140) will take very long as it will have to go through the list looking for
> each of the nodes. We try to get rid of this call by rewriting the logic in
> the walker.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)