[ https://issues.apache.org/jira/browse/IMPALA-14469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18029388#comment-18029388 ]

ASF subversion and git services commented on IMPALA-14469:
----------------------------------------------------------

Commit cde4bc016c02cf582f2469083392b0bcc7f2bf56 in impala's branch 
refs/heads/master from Steve Carlin
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=cde4bc016 ]

IMPALA-14115: Calcite planner: Added top-n analytic PlanNode optimization.

Impala has an optimization for analytic expressions that have a rank filter on
top of them. It can add a top-n plan node to reduce the number of rows
examined. This is tested by TPC-DS query 67.
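
As a rough illustration of why a top-n step can sit underneath a rank filter,
the sketch below models both strategies in plain Python. It is not Impala code:
the row layout, the helper names (rank_rows, full_then_filter, topn_then_rank)
and the ascending rank() semantics are all assumptions made for the example.

```python
from collections import defaultdict

def rank_rows(rows, part_key, order_key):
    """Assign a SQL-style rank() per partition, ascending by order_key."""
    parts = defaultdict(list)
    for r in rows:
        parts[r[part_key]].append(r)
    out = []
    for p in parts.values():
        p.sort(key=lambda r: r[order_key])
        rank, prev = 0, object()  # sentinel never equals a real value
        for i, r in enumerate(p, start=1):
            if r[order_key] != prev:
                rank, prev = i, r[order_key]
            out.append({**r, "rk": rank})
    return out

def full_then_filter(rows, part_key, order_key, n):
    """Baseline: rank every row, then apply the filter rk <= n."""
    return [r for r in rank_rows(rows, part_key, order_key) if r["rk"] <= n]

def topn_then_rank(rows, part_key, order_key, n):
    """Hypothetical top-n step: prune each partition before ranking."""
    parts = defaultdict(list)
    for r in rows:
        parts[r[part_key]].append(r)
    kept = []
    for p in parts.values():
        p.sort(key=lambda r: r[order_key])
        if len(p) > n:
            cutoff = p[n - 1][order_key]
            # Keep ties with the n-th row; anything larger has rank > n.
            p = [r for r in p if r[order_key] <= cutoff]
        kept.extend(p)
    return [r for r in rank_rows(kept, part_key, order_key) if r["rk"] <= n]

rows = [
    {"g": "a", "v": 3}, {"g": "a", "v": 1}, {"g": "a", "v": 1},
    {"g": "a", "v": 2}, {"g": "b", "v": 5}, {"g": "b", "v": 4},
]
assert topn_then_rank(rows, "g", "v", 2) == full_then_filter(rows, "g", "v", 2)
```

Both functions return the same rows, but the second one hands fewer rows to the
ranking step, which is the effect the top-n plan node is meant to achieve.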

The optimization logic relies on an unassigned rank conjunct within the analyzer
while creating the analytic plan node.

A slight reorganization of the code was needed to implement this optimization.
The SlotRefs for the AnalyticInfo needed to be created a little earlier than
they were in the previous commit.

A small fix was made to normalize binary predicates. A non-normalized binary
predicate prevents the optimization from being used.
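
The commit message does not spell out what normalization involves. The sketch
below assumes it means rewriting a comparison so that the column is on the left
and the literal on the right (for example, 10 >= rk becomes rk <= 10), and shows
how a pattern match that expects that shape would miss the non-normalized form.
The tuple representation and the names normalize and rank_limit are
hypothetical, not Impala APIs.

```python
# Operator to use after swapping the two operands of a comparison.
FLIPPED = {"<": ">", "<=": ">=", ">": "<", ">=": "<=", "=": "=", "!=": "!="}

def is_literal(x):
    return isinstance(x, (int, float))

def normalize(pred):
    """Rewrite (literal, op, column) as (column, flipped_op, literal)."""
    left, op, right = pred
    if is_literal(left) and not is_literal(right):
        return (right, FLIPPED[op], left)
    return pred

def rank_limit(pred, rank_col):
    """Return n for 'rank_col <= n' or 'rank_col < n + 1', else None."""
    left, op, right = pred
    if left != rank_col or not is_literal(right):
        return None
    if op == "<=":
        return right
    if op == "<":
        return right - 1
    return None

raw = (10, ">=", "rk")
assert rank_limit(raw, "rk") is None            # shape not recognized
assert rank_limit(normalize(raw), "rk") == 10   # recognized once normalized
```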

A call to checkAndApplyLimitPushdown is needed for some of the optimizations
to kick in.

A new AllProjectInfo internal class was created to hold the relationships
between the Calcite RexNode objects and the Impala Analytic expressions.

Also, IMPALA-14158 is fixed by this commit. The nullsFirst value was
incorrect when the syntax was explicit in the query.

A new Calcite planner test was added to the junit tests to ensure the
optimization kicks in. The new test file is
PlannerTest/calcite/limit-pushdown-analytic-calcite.test. It is a copy of the
limit-pushdown-analytic.test file in its parent directory, with some modified
results. Most of the differences are trivial, but IMPALA-14469 has been filed
to deal with one case that is still not optimized: an order by clause that
contains a constant expression.

Change-Id: Ie6fa6781db56771b13b0cf49bd236f776016bf8d
Reviewed-on: http://gerrit.cloudera.org:8080/23317
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Aman Sinha <[email protected]>


> Calcite Planner: top-n optimization not working with constant in order by
> -------------------------------------------------------------------------
>
>                 Key: IMPALA-14469
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14469
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Steve Carlin
>            Priority: Major
>
> After IMPALA-14415 gets committed, there is still a problem in the
> limit-pushdown-analytic-calcite.test file (the affected test will be marked
> with this Jira number).
> The first order by expression is a constant. Since this matches the analytic
> expression, we should be able to use the top-n optimization.
> Instead, the constant is optimized out of the sort expression but not out of
> the analytic expression, so the top-n optimization does not kick in.
> Pushing this out of the v1 cut because this is an edge case. The workaround
> is to remove the constant from the order by expression, which is essentially
> a no-op.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
