[ 
https://issues.apache.org/jira/browse/CALCITE-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18011970#comment-18011970
 ] 

Julian Hyde commented on CALCITE-7125:
--------------------------------------

Having multiple rule instances (with slightly different behaviors) is one of 
the easiest ways to configure Calcite. Would it be feasible to have two 
variants of {{{}IntersectToDistinctRule{}}}, with behaviors pre- and 
post-CALCITE-6893? (I'm just asking, at this point. I haven't thought through 
Alessandro's scenario in depth.)

> Impossible to get a plan with partial aggregate push-down via 
> IntersectToDistinctRule
> -------------------------------------------------------------------------------------
>
>                 Key: CALCITE-7125
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7125
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.40.0
>            Reporter: Alessandro Solimando
>            Priority: Major
>              Labels: breaking_change
>
> CALCITE-6893 changed _IntersectToDistinctRule_ to rewrite "INTERSECT 
> DISTINCT" queries using a tagging plus filtering strategy rather than pushing 
> aggregates into the "UNION" inputs.
> Rationale was to let other rules (_AGGREGATE_UNION_TRANSPOSE_ is mentioned in 
> the ticket description) produce the plan with partial aggregation push-down, 
> and let the CBO do the choice.
> Consider the following query:
> {noformat}
> SELECT empno FROM emp
> INTERSECT
> SELECT empno FROM emp
> {noformat}
> _IntersectToDistinctRule_, as per CALCITE-6893, would rewrite it as follows:
> {noformat}
> LogicalProject(EMPNO=[$0])
>   LogicalFilter(condition=[AND(>($1, 0), >($2, 0))])
>     LogicalAggregate(group=[{0}], count_i0=[COUNT() FILTER (WHERE i = 0)], 
> count_i1=[COUNT() FILTER (WHERE i = 1)])
>       LogicalUnion(all=[true])
>         LogicalProject(EMPNO=[$0], i=[0]) // "tag" i=0
>           TableScan(emp)
>         LogicalProject(EMPNO=[$0], i=[1]) // "tag" i=1
>           TableScan(emp)
> {noformat}
> This structure introduces:
>  - A tagging column "i" added via projection
>  - Two Projects wrapping each input
>  - A single Aggregate with COUNT() FILTER expressions over those tags
>  - A final filter requiring count_iN > 0, for N in "1..#operands"
> With this rewrite, the planner:
>  - Can no longer push aggregates into the union inputs
>  - Cannot apply rules like 
> _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ or 
> _AGGREGATE_PROJECT_MERGE_/_AggregateProjectMergeRule_
>  - Fails to recover the earlier, sometimes more efficient plan shape:
> {noformat}
> LogicalAggregate(group=[{0}])
>   LogicalUnion(all=[true])
>     LogicalAggregate(group=[{0}])
>       ...
>     LogicalAggregate(group=[{0}])
>       ...
> {noformat}
> This prevents early row reduction and downstream rule composition, even when 
> all relevant transformation rules are enabled.
> Prior to CALCITE-6893, the same query with the same planner configuration 
> produced the optimized form above. That plan is no longer reachable with the 
> current {_}IntersectToDistinctRule{_}, and no combination of core rules 
> (including _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_, 
> _PROJECT_SET_OP_TRANSPOSE_/_ProjectSetOpTransposeRule_, etc.) can recover it.
> Unless I am mistaken and someone can suggest further rules that can lead to 
> the same plan as before, this seems like a regression.
> My proposal would be to either revert the change, or to at least expose the 
> "old" behavior via a configuration flag.
> My understanding is that, the original rewrite proposed in CALCITE-6893 was 
> supposed to rewritable (by means of 
> _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ rule). The rewrite 
> was incorrect and the currently implemented form has been suggested, but it 
> hasn't been discussed if the "new" plan shape (post CALCITE-6893) could be 
> transformed into the "old" plan shape (pre CALCITE-6893).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to