[ 
https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135643#comment-15135643
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12923:
----------------------------------------------------------

[~jcamachorodriguez] Thanks for the review comments.
1. I agree that this can cause a regression in join merging on the return 
path in some cases. As commented in the fix, it should be removed once 
CALCITE-1069 is fixed, but it is still better than throwing an exception to 
the end user. One way to optimize this would be to check only for the first 
AGGREGATE node below the current PROJECT instead of searching the entire 
subtree. Does that sound fine?
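
To make the proposed narrowing concrete, here is a minimal sketch that contrasts the two checks on a toy plan tree. It does not use the Calcite or Hive classes: `Node`, `Kind`, `hasAggregateInSubtree` and `hasAggregateDirectlyBelow` are hypothetical names invented for illustration, and reading "first AGGREGATE below the PROJECT" as "the first non-FILTER node under the PROJECT" is an assumption, not something taken from the patch.

```java
import java.util.Arrays;
import java.util.List;

public class AggregateLookupSketch {
    // Toy operator kinds; these stand in for the real plan node classes.
    enum Kind { PROJECT, FILTER, AGGREGATE, JOIN, SCAN }

    // Hypothetical plan node: a kind plus its inputs.
    static final class Node {
        final Kind kind;
        final List<Node> inputs;

        Node(Kind kind, Node... inputs) {
            this.kind = kind;
            this.inputs = Arrays.asList(inputs);
        }
    }

    // Conservative check: true if an AGGREGATE appears anywhere in the subtree.
    static boolean hasAggregateInSubtree(Node node) {
        if (node.kind == Kind.AGGREGATE) {
            return true;
        }
        for (Node input : node.inputs) {
            if (hasAggregateInSubtree(input)) {
                return true;
            }
        }
        return false;
    }

    // Narrower check (assumed reading of the proposal): starting at a PROJECT,
    // step through any FILTER nodes and report whether the first other node
    // reached is an AGGREGATE.
    static boolean hasAggregateDirectlyBelow(Node project) {
        if (project.kind != Kind.PROJECT || project.inputs.size() != 1) {
            return false;
        }
        Node current = project.inputs.get(0);
        while (current.kind == Kind.FILTER && current.inputs.size() == 1) {
            current = current.inputs.get(0);
        }
        return current.kind == Kind.AGGREGATE;
    }

    public static void main(String[] args) {
        // PROJECT -> FILTER -> AGGREGATE -> SCAN
        Node direct = new Node(Kind.PROJECT,
            new Node(Kind.FILTER,
                new Node(Kind.AGGREGATE, new Node(Kind.SCAN))));

        // PROJECT -> JOIN(PROJECT -> AGGREGATE -> SCAN, SCAN)
        Node nested = new Node(Kind.PROJECT,
            new Node(Kind.JOIN,
                new Node(Kind.PROJECT,
                    new Node(Kind.AGGREGATE, new Node(Kind.SCAN))),
                new Node(Kind.SCAN)));

        System.out.println("direct: subtree=" + hasAggregateInSubtree(direct)
            + " directlyBelow=" + hasAggregateDirectlyBelow(direct));
        System.out.println("nested: subtree=" + hasAggregateInSubtree(nested)
            + " directlyBelow=" + hasAggregateDirectlyBelow(nested));
    }
}
```

In this toy model the two checks agree when the AGGREGATE sits right under the PROJECT (possibly behind a FILTER), and differ when it is buried under another JOIN. The second case is where the narrower check would stop blocking the transpose.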

2. I don't fully follow your second point. There can be a filter between the 
Project and the Aggregate, as in JOIN->PROJECT->FILTER->AGGREGATE (e.g. a NOT 
NULL filter), but I believe a PROJECT is always present somewhere above an 
AGGREGATE after CalcitePlanner.genLogicalPlan(QB qb, boolean outerMostQB) 
introduces a SELECT to remove the indicator columns from the GBY rowschema. If 
you are referring to a FILTER above the PROJECT, then the 
HiveJoinProjectTranspose rule shouldn't kick in, right?

Thanks
Hari



> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> groupby_grouping_sets4.q failure
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12923
>                 URL: https://issues.apache.org/jira/browse/HIVE-12923
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
>         at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>         at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
>         at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
>         at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
>         at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
>         at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
