Mostafa Mokhtar created HIVE-10369:
--------------------------------------

             Summary: CBO: Don't use HiveDefaultCostModel when running on Tez with hive.cbo.costmodel.extended enabled
                 Key: HIVE-10369
                 URL: https://issues.apache.org/jira/browse/HIVE-10369
             Project: Hive
          Issue Type: Sub-task
          Components: CBO
    Affects Versions: 1.2.0
            Reporter: Mostafa Mokhtar
            Assignee: Laljo John Pullokkaran
             Fix For: 1.2.0


When calculating parallelism, we end up calling HiveDefaultCostModel.getSplitCount, 
which returns null, instead of HiveOnTezCostModel.getSplitCount. This results in 
wrong parallelism.
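
The delegation behind this can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the Hive code: the class SplitCountSketch, its nested types, and the streamingSideSplits field are hypothetical stand-ins, and the only behaviour taken from this report is that the default algorithm's getSplitCount returns null.

```java
public class SplitCountSketch {

    /** Hypothetical stand-in for a join algorithm inside a cost model. */
    interface JoinAlgorithm {
        Integer getSplitCount(Join join);
    }

    /** Mirrors the reported behaviour: the default algorithm has no split count. */
    static final class DefaultJoinAlgorithm implements JoinAlgorithm {
        @Override
        public Integer getSplitCount(Join join) {
            return null;
        }
    }

    /** Assumption for illustration: a Tez algorithm reports the streaming side's splits. */
    static final class TezMapJoinAlgorithm implements JoinAlgorithm {
        @Override
        public Integer getSplitCount(Join join) {
            return join.streamingSideSplits;
        }
    }

    /** Hypothetical join node that delegates to whichever algorithm it currently holds. */
    static final class Join {
        final int streamingSideSplits;
        JoinAlgorithm algorithm = new DefaultJoinAlgorithm();

        Join(int streamingSideSplits) {
            this.streamingSideSplits = streamingSideSplits;
        }

        Integer getSplitCount() {
            return algorithm.getSplitCount(this);
        }
    }

    public static void main(String[] args) {
        Join join = new Join(8);
        // Still holding the default algorithm: the split count is null.
        System.out.println(join.getSplitCount());
        // Once the Tez algorithm is set on the node, the same call yields a value.
        join.algorithm = new TezMapJoinAlgorithm();
        System.out.println(join.getSplitCount());
    }
}
```

In this sketch the result depends only on which algorithm the node holds when it is queried, so a node that still carries the default algorithm yields null regardless of which cost model is doing the costing.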

This happens for the following join:
{code}
org.apache.calcite.plan.RelOptUtil.toString(join)
         (java.lang.String) HiveJoin(condition=[=($1, $3)], joinType=[inner], algorithm=[none], cost=[not available])
  HiveProject(cs_sold_date_sk=[$0], cs_bill_customer_sk=[$3], cs_sales_price=[$21])
    HiveTableScan(table=[[tpcds_bin_orc_200.catalog_sales]])
  HiveJoin(condition=[=($1, $2)], joinType=[inner], algorithm=[MapJoin], cost=[{2400000.0 rows, 6.400008E11 cpu, 1294.6098 io}])
    HiveProject(c_customer_sk=[$0], c_current_addr_sk=[$4])
      HiveTableScan(table=[[tpcds_bin_orc_200.customer]])
    HiveProject(ca_address_sk=[$0], ca_state=[$8], ca_zip=[$9])
      HiveTableScan(table=[[tpcds_bin_orc_200.customer_address]])
{code}


The issue appears to be happening very early when calling 
{code}
if (pushDownTree != null) {
  costPushDown =
      RelMetadataQuery.getCumulativeCost(pushDownTree.getJoinTree());
}
{code}

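At this point pushDownTree.getJoinTree().joinAlgorithm is 
HiveOnTezCostModel$TezMapJoinAlgorithm, yet the call stack below ends in 
HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount.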


Call stack.
{code}
HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount(HiveJoin) line: 114     
HiveJoin.getSplitCount() line: 136      
HiveRelMdParallelism.splitCount(HiveJoin) line: 63      
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57      
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
Method.invoke(Object, Object...) line: 606      
ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182    
$Proxy46.splitCount() line: not available       
GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available  
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
Method.invoke(Object, Object...) line: 606      
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
$Proxy46.splitCount() line: not available
GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
$Proxy46.splitCount() line: not available
GeneratedMethodAccessor26.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
$Proxy46.splitCount() line: not available       
RelMetadataQuery.splitCount(RelNode) line: 401  
HiveOnTezCostModel$TezMapJoinAlgorithm.getCost(HiveJoin) line: 255      
HiveOnTezCostModel(HiveCostModel).getJoinCost(HiveJoin) line: 64        
HiveRelMdCost.getNonCumulativeCost(HiveJoin) line: 56   
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57      
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
Method.invoke(Object, Object...) line: 606      
ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182    
$Proxy41.getNonCumulativeCost() line: not available     
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available  
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
Method.invoke(Object, Object...) line: 606      
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
$Proxy41.getNonCumulativeCost() line: not available
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
$Proxy41.getNonCumulativeCost() line: not available
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
$Proxy41.getNonCumulativeCost() line: not available
GeneratedMethodAccessor22.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
$Proxy41.getNonCumulativeCost() line: not available     
RelMetadataQuery.getNonCumulativeCost(RelNode) line: 115        
HiveRelMdDistinctRowCount.getCumulativeCost(HiveJoin) line: 114 
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57      
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
Method.invoke(Object, Object...) line: 606      
ReflectiveRelMetadataProvider$1$1.invoke(Object, Method, Object[]) line: 182    
$Proxy40.getCumulativeCost() line: not available        
GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available  
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
Method.invoke(Object, Object...) line: 606      
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
$Proxy40.getCumulativeCost() line: not available
GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(Object, Method, Object[]) line: 109
$Proxy40.getCumulativeCost() line: not available
GeneratedMethodAccessor21.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 606
CachingRelMetadataProvider$CachingInvocationHandler.invoke(Object, Method, Object[]) line: 132
$Proxy40.getCumulativeCost() line: not available        
RelMetadataQuery.getCumulativeCost(RelNode) line: 101   
LoptOptimizeJoinRule.addFactorToTree(LoptMultiJoin, LoptSemiJoinOptimizer, LoptJoinTree, int, BitSet, List<RexNode>, boolean) line: 940
LoptOptimizeJoinRule.createOrdering(LoptMultiJoin, LoptSemiJoinOptimizer, int) line: 726
LoptOptimizeJoinRule.findBestOrderings(LoptMultiJoin, LoptSemiJoinOptimizer, RelOptRuleCall) line: 458
LoptOptimizeJoinRule.onMatch(RelOptRuleCall) line: 128  
HepPlanner(AbstractRelOptPlanner).fireRule(RelOptRuleCall) line: 326    
HepPlanner.applyRule(RelOptRule, HepRelVertex, boolean) line: 515       
HepPlanner.applyRules(Collection<RelOptRule>, boolean) line: 392        
HepPlanner.executeInstruction(HepInstruction$RuleInstance) line: 255    
HepInstruction$RuleInstance.execute(HepPlanner) line: 125       
HepPlanner.executeProgram(HepProgram) line: 207 
HepPlanner.findBestExp() line: 194      
CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 849
CalcitePlanner$CalcitePlannerAction.apply(RelOptCluster, RelOptSchema, SchemaPlus) line: 761
Frameworks$1.apply(RelOptCluster, RelOptSchema, SchemaPlus, CalciteServerStatement) line: 109
CalcitePrepareImpl.perform(CalciteServerStatement, PrepareAction<R>) line: 730  
Frameworks.withPrepare(PrepareAction<R>) line: 145      
Frameworks.withPlanner(PlannerAction<R>, FrameworkConfig) line: 105     
CalcitePlanner.getOptimizedAST() line: 602      
CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext) line: 240    
CalcitePlanner(SemanticAnalyzer).analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext) line: 10003
CalcitePlanner.analyzeInternal(ASTNode) line: 203       
CalcitePlanner(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224        
ExplainSemanticAnalyzer.analyzeInternal(ASTNode) line: 74       
ExplainSemanticAnalyzer(BaseSemanticAnalyzer).analyze(ASTNode, Context) line: 224
Driver.compile(String, boolean) line: 424       
Driver.compile(String) line: 308        
Driver.compileInternal(String) line: 1122       
Driver.runInternal(String, boolean) line: 1170  
Driver.run(String, boolean) line: 1059  
Driver.run(String) line: 1049   
CliDriver.processLocalCmd(String, CommandProcessor, CliSessionState) line: 213  
CliDriver.processCmd(String) line: 165  
CliDriver.processLine(String, boolean) line: 376        
CliDriver.executeDriver(CliSessionState, HiveConf, OptionsProcessor) line: 736  
CliDriver.run(String[]) line: 681       
CliDriver.main(String[]) line: 621      
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available [native method]
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57      
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
{code}
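
Most frames in the stack above are reflection and metadata-provider plumbing. A small helper along these lines could strip them so the actual call chain is easier to follow. It is a sketch only: the class StackNoiseFilter and the prefix list are assumptions chosen to match the frames shown in this report, not part of Hive or Calcite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StackNoiseFilter {

    // Prefixes of frames treated as plumbing; picked by hand from the trace above.
    private static final List<String> NOISE_PREFIXES = Arrays.asList(
        "NativeMethodAccessorImpl.",
        "DelegatingMethodAccessorImpl.",
        "GeneratedMethodAccessor",
        "Method.invoke(",
        "$Proxy",
        "ReflectiveRelMetadataProvider$",
        "ChainedRelMetadataProvider$",
        "CachingRelMetadataProvider$");

    /** Returns the frames that do not start with any of the noise prefixes. */
    static List<String> significantFrames(List<String> frames) {
        List<String> kept = new ArrayList<>();
        for (String raw : frames) {
            String frame = raw.trim();
            if (frame.isEmpty()) {
                continue;
            }
            boolean noise = false;
            for (String prefix : NOISE_PREFIXES) {
                if (frame.startsWith(prefix)) {
                    noise = true;
                    break;
                }
            }
            if (!noise) {
                kept.add(frame);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // A few frames copied from the top of the call stack in this report.
        List<String> frames = Arrays.asList(
            "HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount(HiveJoin) line: 114",
            "HiveJoin.getSplitCount() line: 136",
            "HiveRelMdParallelism.splitCount(HiveJoin) line: 63",
            "NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57",
            "DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43",
            "Method.invoke(Object, Object...) line: 606",
            "$Proxy46.splitCount() line: not available",
            "RelMetadataQuery.splitCount(RelNode) line: 401",
            "HiveOnTezCostModel$TezMapJoinAlgorithm.getCost(HiveJoin) line: 255");
        for (String frame : significantFrames(frames)) {
            System.out.println(frame);
        }
    }
}
```

Applied to the sample frames, this keeps the five frames running from HiveOnTezCostModel$TezMapJoinAlgorithm.getCost down to HiveDefaultCostModel$DefaultJoinAlgorithm.getSplitCount, which is the path this issue is about.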


