[ https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135643#comment-15135643 ]
Hari Sankar Sivarama Subramaniyan commented on HIVE-12923:
----------------------------------------------------------

[~jcamachorodriguez] Thanks for the review comments.

1. I agree that this can potentially cause a regression in join merging on the return path in some cases. As commented in the fix, this should be removed once CALCITE-1069 is fixed, but this fix is better than throwing an exception to the end user. One way to optimize this would be to check only for the first AGGREGATE node below the current PROJECT instead of the entire subtree. Does that sound fine?

2. I don't fully follow your second point. There can be a FILTER between the PROJECT and the AGGREGATE, as in JOIN->PROJECT->FILTER->AGGREGATE (e.g. a NOT NULL filter), but I believe a PROJECT is always present somewhere above an AGGREGATE after CalcitePlanner.genLogicalPlan(QB qb, boolean outerMostQB) introduces a SELECT to remove the indicator columns from the GBY rowschema. If you are referring to a FILTER above the PROJECT, then the HiveJoinProjectTranspose rule shouldn't kick in, right?
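To make the alternative in point 1 concrete, here is a rough sketch that contrasts the whole-subtree check with the narrower one. It does not use the Calcite RelNode API or the code in the patch: Kind, Node and both method names are hypothetical stand-ins, and skipping over FILTER nodes between the PROJECT and the AGGREGATE is one possible reading of "first AGGREGATE node below the current PROJECT".

```java
import java.util.Arrays;
import java.util.List;

public class AggregateCheckSketch {
    // Hypothetical operator kinds; stand-ins for the real RelNode subclasses.
    enum Kind { JOIN, PROJECT, FILTER, AGGREGATE, SCAN }

    // Hypothetical minimal plan node: an operator kind plus its inputs.
    static final class Node {
        final Kind kind;
        final List<Node> inputs;

        Node(Kind kind, Node... inputs) {
            this.kind = kind;
            this.inputs = Arrays.asList(inputs);
        }
    }

    // Broad check: true if an AGGREGATE appears anywhere in the subtree.
    static boolean subtreeContainsAggregate(Node node) {
        if (node.kind == Kind.AGGREGATE) {
            return true;
        }
        for (Node input : node.inputs) {
            if (subtreeContainsAggregate(input)) {
                return true;
            }
        }
        return false;
    }

    // Narrower check (assumed reading): starting at a PROJECT, skip any
    // FILTER nodes and report whether the next operator is an AGGREGATE.
    // Anything else (JOIN, SCAN, another PROJECT) ends the search.
    static boolean aggregateBelowProject(Node project) {
        Node current = firstInput(project);
        while (current != null && current.kind == Kind.FILTER) {
            current = firstInput(current);
        }
        return current != null && current.kind == Kind.AGGREGATE;
    }

    private static Node firstInput(Node node) {
        return node.inputs.isEmpty() ? null : node.inputs.get(0);
    }
}
```

For PROJECT->FILTER->AGGREGATE both checks match. For a PROJECT sitting on top of a JOIN whose inputs contain an AGGREGATE further down, only the whole-subtree check matches, which is the case where the narrower check would leave join merging alone.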

Thanks
Hari

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12923
>                 URL: https://issues.apache.org/jira/browse/HIVE-12923
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
>   at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
>   at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
>   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>   at org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
>   at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>   at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
>   at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
>   at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)