[ https://issues.apache.org/jira/browse/HIVE-25941?focusedWorklogId=755214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-755214 ]
ASF GitHub Bot logged work on HIVE-25941:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Apr/22 13:06
            Start Date: 11/Apr/22 13:06
    Worklog Time Spent: 10m
      Work Description: kasakrisz commented on code in PR #3014:
URL: https://github.com/apache/hive/pull/3014#discussion_r847305506

##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/MaterializedViewsCache.java:
##########
@@ -205,4 +212,52 @@ HiveRelOptMaterialization get(String dbName, String viewName) {
   public boolean isEmpty() {
     return materializedViews.isEmpty();
   }
+
+
+  private static class ASTKey {
+    private final ASTNode root;
+
+    public ASTKey(ASTNode root) {
+      this.root = root;
+    }
+
+    @Override
+    public boolean equals(Object o) {
+      if (this == o) return true;
+      if (o == null || getClass() != o.getClass()) return false;
+      ASTKey that = (ASTKey) o;
+      return equals(root, that.root);
+    }
+
+    private boolean equals(ASTNode astNode1, ASTNode astNode2) {
+      if (!(astNode1.getType() == astNode2.getType() &&
+              astNode1.getText().equals(astNode2.getText()) &&
+              astNode1.getChildCount() == astNode2.getChildCount())) {
+        return false;
+      }
+
+      for (int i = 0; i < astNode1.getChildCount(); ++i) {
+        if (!equals((ASTNode) astNode1.getChild(i), (ASTNode) astNode2.getChild(i))) {
+          return false;
+        }
+      }
+
+      return true;
+    }
+
+    @Override
+    public int hashCode() {
+      return hashcode(root);

Review Comment:
   * The hashcode of each AST stored in the `MaterializedViewsCache` is calculated only once, when the MVs are loaded at HS2 startup or when a new MV is created, because the Java HashMap implementation caches the key's hashcode.
   * When we look up a materialization, the hashcode of the key is calculated every time the get method is called. This happens only once for the entire tree per query.
   * To find sub-query rewrites, the look-up is done by sub-ASTs, so the hashcode is also calculated for the subtrees, but when I ran some performance tests locally I didn't find this to be a bottleneck.
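The hash-caching behaviour described in the points above can be checked with a small standalone sketch. This is not the Hive code: `HashCachingSketch`, `Node`, `TreeKey`, `sample` and `countHashCalls` are hypothetical names, and `Node` is only a stand-in for `ASTNode`. It assumes a plain `java.util.HashMap` and counts how often the structural hash of a key is computed during a put and during repeated gets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashCachingSketch {

  /** Hypothetical stand-in for ASTNode: token type, text and children. */
  static final class Node {
    final int type;
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(int type, String text, Node... kids) {
      this.type = type;
      this.text = text;
      for (Node k : kids) {
        children.add(k);
      }
    }
  }

  /** Hypothetical key in the spirit of ASTKey: structural equals and hashCode. */
  static final class TreeKey {
    static int hashCalls = 0; // number of whole-tree hash computations
    final Node root;

    TreeKey(Node root) {
      this.root = root;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (o == null || getClass() != o.getClass()) return false;
      return same(root, ((TreeKey) o).root);
    }

    private static boolean same(Node a, Node b) {
      if (a.type != b.type || !a.text.equals(b.text)
          || a.children.size() != b.children.size()) {
        return false;
      }
      for (int i = 0; i < a.children.size(); i++) {
        if (!same(a.children.get(i), b.children.get(i))) return false;
      }
      return true;
    }

    @Override
    public int hashCode() {
      hashCalls++;
      return hash(root);
    }

    // Walks the whole tree, so the cost grows with the size of the tree.
    private static int hash(Node n) {
      int h = 31 * n.type + n.text.hashCode();
      for (Node c : n.children) {
        h = 31 * h + hash(c);
      }
      return h;
    }
  }

  static Node sample() {
    return new Node(1, "TOK_QUERY", new Node(2, "TOK_FROM"), new Node(3, "TOK_SELECT"));
  }

  /** Returns {hash computations during put, hash computations during three gets}. */
  static int[] countHashCalls() {
    Map<TreeKey, String> cache = new HashMap<>();
    TreeKey.hashCalls = 0;
    cache.put(new TreeKey(sample()), "mv1"); // stored key is hashed once here
    int afterPut = TreeKey.hashCalls;
    for (int i = 0; i < 3; i++) {
      cache.get(new TreeKey(sample())); // only the look-up key is hashed
    }
    return new int[] {afterPut, TreeKey.hashCalls - afterPut};
  }

  public static void main(String[] args) {
    int[] counts = countHashCalls();
    System.out.println("put: " + counts[0] + ", gets: " + counts[1]); // put: 1, gets: 3
  }
}
```

The stored key is hashed once on insertion and the map keeps that value with the entry, so each later get only pays for hashing the look-up key plus an `equals` call when the cached hash values match.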
   This AST-keyed look-up is still much faster than generating the expanded query text of every possible sub-query using `UnparseTranslator` and `TokenRewriteStream`.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 755214)
    Time Spent: 1h 20m  (was: 1h 10m)

> Long compilation time of complex query due to analysis for materialized view
> rewrite
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-25941
>                 URL: https://issues.apache.org/jira/browse/HIVE-25941
>             Project: Hive
>          Issue Type: Bug
>          Components: Materialized views
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: sample.png
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When compiling a query, the optimizer tries to rewrite the query plan or
> subtrees of the plan to use materialized view scans.
> If
> {code}
> set hive.materializedview.rewriting.sql.subquery=false;
> {code}
> the compilation succeeds in less than 10 sec; otherwise it takes several
> minutes (~ 5 min), depending on the hardware.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)