[ https://issues.apache.org/jira/browse/HIVE-10369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498904#comment-14498904 ]
Mostafa Mokhtar commented on HIVE-10369:
----------------------------------------

[~jcamachorodriguez]

> CBO: Don't use HiveDefaultCostModel when With Tez and hive.cbo.costmodel.extended enabled
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-10369
>                 URL: https://issues.apache.org/jira/browse/HIVE-10369
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>    Affects Versions: 1.2.0
>            Reporter: Mostafa Mokhtar
>            Assignee: Laljo John Pullokkaran
>             Fix For: 1.2.0
>
>
> When calculating parallelism, we end up using HiveDefaultCostModel.getSplitCount which returns null instead of HiveOnTezCostModel.getSplitCount which results in wrong parallelism.
> This happens for this join
> {code}
> org.apache.calcite.plan.RelOptUtil.toString(join)
>        (java.lang.String) HiveJoin(condition=[=($1, $3)], joinType=[inner], algorithm=[none], cost=[not available])
>   HiveProject(cs_sold_date_sk=[$0], cs_bill_customer_sk=[$3], cs_sales_price=[$21])
>     HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]])
>   HiveJoin(condition=[=($1, $2)], joinType=[inner], algorithm=[MapJoin], cost=[{2400000.0 rows, 6.400008E11 cpu, 1294.6098 io}])
>     HiveProject(c_customer_sk=[$0], c_current_addr_sk=[$4])
>       HiveTableScan(table=[[tpcds_bin_orc_200.customer]])
>     HiveProject(ca_address_sk=[$0], ca_state=[$8], ca_zip=[$9])
>       HiveTableScan(table=[[tpcds_bin_orc_200.customer_address]])
> {code}
> The issue appears to be happening very early when calling
> {code}
>       if (pushDownTree != null) {
>         costPushDown = RelMetadataQuery.getCumulativeCost(pushDownTree.getJoinTree());
>       }
> {code}
> As pushDownTree.getJoinTree().joinAlgorithm = HiveOnTezCostModel$TezMapJoinAlgorithm
> Call stack.
> {code}
> HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount(HiveJoin) line: 114
> HiveJoin.getSplitCount() line: 136
> HiveRelMdParallelism.splitCount(HiveJoin) line: 63
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182
> $Proxy46.splitCount() line: not available
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy46.splitCount() line: not available
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy46.splitCount() line: not available
> GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
> $Proxy46.splitCount() line: not available
> RelMetadataQuery.splitCount(RelNode) line: 401
> HiveOnTezCostModel$TezMapJoinAlgorithm.getCost(HiveJoin) line: 255
> HiveOnTezCostModel(HiveCostModel).getJoinCost(HiveJoin) line: 64
> HiveRelMdCost.getNonCumulativeCost(HiveJoin) line: 56
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy41.getNonCumulativeCost() line: not available
> GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
> $Proxy41.getNonCumulativeCost() line: not available
> RelMetadataQuery.getNonCumulativeCost(RelNode) line: 115
> HiveRelMdDistinctRowCount.getCumulativeCost(HiveJoin) line: 114
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182
> $Proxy40.getCumulativeCost() line: not available
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy40.getCumulativeCost() line: not available
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
> $Proxy40.getCumulativeCost() line: not available
> GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> Method.invoke(Object, Object...) line: 606
> CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
> $Proxy40.getCumulativeCost() line: not available
> RelMetadataQuery.getCumulativeCost(RelNode) line: 101
> LoptOptimizeJoinRule.addFactorToTree(LoptMultiJoin, LoptSemiJoinOptimizer, LoptJoinTree, int, BitSet, List<RexNode>, boolean) line: 940
> LoptOptimizeJoinRule.createOrdering(LoptMultiJoin, LoptSemiJoinOptimizer, int) line: 726
> LoptOptimizeJoinRule.findBestOrderings(LoptMultiJoin, LoptSemiJoinOptimizer, RelOptRuleCall) line: 458
> LoptOptimizeJoinRule.onMatch(RelOptRuleCall) line: 128
> HepPlanner(AbstractRelOptPlanner).fireRule(RelOptRuleCall) line: 326
> HepPlanner.applyRule(RelOptRule, HepRelVertex, boolean) line: 515
> HepPlanner.applyRules(Collection<RelOptRule>, boolean) line: 392
> HepPlanner.executeInstruction(HepInstruction$RuleInstance) line: 255
> HepInstruction$RuleInstance.execute(HepPlanner) line: 125
> HepPlanner.executeProgram(HepProgram) line: 207
> HepPlanner.findBestExp() line: 194
> CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 849
> CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 761
> Frameworks$1.apply(RelOptCluster, RelOptSchema, SchemaPlus, CalciteServerStatement) line: 109
> CalcitePrepareImpl.perform(CalciteServerStatement, PrepareAction<R>) line: 730
> Frameworks.withPrepare(PrepareAction<R>) line: 145
> Frameworks.withPlanner(PlannerAction<R>, FrameworkConfig) line: 105
> CalcitePlanner.getOptimizedAST() line: 602
> CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext) line: 240
> CalcitePlanner(SemanticAnalyzer).analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext) line: 10003
> CalcitePlanner.analyzeInternal(ASTNode) line: 203
> CalcitePlanner(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224
> ExplainSemanticAnalyzer.analyzeInternal(ASTNode) line: 74
> ExplainSemanticAnalyzer(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224
> Driver.compile(String, boolean) line: 424
> Driver.compile(String) line: 308
> Driver.compileInternal(String) line: 1122
> Driver.runInternal(String, boolean) line: 1170
> Driver.run(String, boolean) line: 1059
> Driver.run(String) line: 1049
> CliDriver.processLocalCmd(String, CommandProcessor, CliSessionState) line: 213
> CliDriver.processCmd(String) line: 165
> CliDriver.processLine(String, boolean) line: 376
> CliDriver.executeDriver(CliSessionState, HiveConf, OptionsProcessor) line: 736
> CliDriver.run(String[]) line: 681
> CliDriver.main(String[]) line: 621
> NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
> NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57
> DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
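
For readers skimming the trace, the failure mode described in the issue can be reduced to a small, self-contained Java sketch. This is not Hive code: SplitCountSketch, JoinAlgorithm, Join, and the max-of-inputs rule are hypothetical stand-ins chosen purely for illustration. The only thing it takes from the report is that the default algorithm's getSplitCount returns null. It shows how a split-count lookup that delegates to whichever algorithm is currently attached to the join can surface that null, even when the caller is a Tez-specific costing path.

```java
public class SplitCountSketch {

    /** Hypothetical stand-in for a per-engine join algorithm. */
    interface JoinAlgorithm {
        /** Returns the split count, or null when the algorithm cannot tell. */
        Integer getSplitCount(Join join);
    }

    /** Mirrors the reported behaviour: the default algorithm yields null. */
    static final JoinAlgorithm DEFAULT = join -> null;

    /** Hypothetical Tez-like rule (an assumption, not Hive's real formula). */
    static final JoinAlgorithm TEZ_MAP_JOIN =
            join -> Math.max(join.leftSplits, join.rightSplits);

    /** Hypothetical join node that delegates to its attached algorithm. */
    static final class Join {
        final int leftSplits;
        final int rightSplits;
        // No algorithm chosen yet, comparable to algorithm=[none] in the plan.
        JoinAlgorithm algorithm = DEFAULT;

        Join(int leftSplits, int rightSplits) {
            this.leftSplits = leftSplits;
            this.rightSplits = rightSplits;
        }

        Integer getSplitCount() {
            return algorithm.getSplitCount(this);
        }
    }

    public static void main(String[] args) {
        Join join = new Join(8, 2);

        // The default algorithm is still attached, so the lookup yields null.
        System.out.println("default attached: " + join.getSplitCount());

        // Once the engine-specific algorithm is attached, a value comes back.
        join.algorithm = TEZ_MAP_JOIN;
        System.out.println("tez attached: " + join.getSplitCount());
    }
}
```

In this toy model the fix is simply to attach the engine-specific algorithm before any metadata is requested; whether the actual patch for HIVE-10369 works that way is not stated in this message.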