Alessandro Solimando created CALCITE-7125: ---------------------------------------------
Summary: Impossible to get a plan with partial aggregate push-down via IntersectToDistinctRule Key: CALCITE-7125 URL: https://issues.apache.org/jira/browse/CALCITE-7125 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.40.0 Reporter: Alessandro Solimando CALCITE-6893 changed _IntersectToDistinctRule_ to rewrite "INTERSECT DISTINCT" queries using a tagging plus filtering strategy rather than pushing aggregates into the "UNION" inputs. Rationale was to let other rules (_AGGREGATE_UNION_TRANSPOSE_ is mentioned in the ticket description) produce the plan with partial aggregation push-down, and let the CBO do the choice. Consider the following query: {noformat} SELECT empno FROM emp INTERSECT SELECT empno FROM emp {noformat} _IntersectToDistinctRule_, as per CALCITE-6893, would rewrite it as follows: {noformat} LogicalProject(EMPNO=[$0]) LogicalFilter(condition=[AND(>($1, 0), >($2, 0))]) LogicalAggregate(group=[{0}], count_i0=[COUNT() FILTER (WHERE i = 0)], count_i1=[COUNT() FILTER (WHERE i = 1)]) LogicalUnion(all=[true]) LogicalProject(EMPNO=[$0], i=[0]) // "tag" i=0 TableScan(emp) LogicalProject(EMPNO=[$0], i=[1]) // "tag" i=1 TableScan(emp) {noformat} This structure introduces: - A tagging column "i" added via projection - Two Projects wrapping each input - A single Aggregate with COUNT() FILTER expressions over those tags - A final filter requiring count_iN > 0, for N in "1..#operands" With this rewrite, the planner: - Can no longer push aggregates into the union inputs - Cannot apply rules like _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ or _AGGREGATE_PROJECT_MERGE_/_AggregateProjectMergeRule_ - Fails to recover the earlier, sometimes more efficient plan shape: {noformat} LogicalAggregate(group=[{0}]) LogicalUnion(all=[true]) LogicalAggregate(group=[{0}]) ... LogicalAggregate(group=[{0}]) ... {noformat} This prevents early row reduction and downstream rule composition, even when all relevant transformation rules are enabled. Prior to CALCITE-6893, the same query with the same planner configuration produced the optimized form above. That plan is no longer reachable with the current {_}IntersectToDistinctRule{_}, and no combination of core rules (including _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_, _PROJECT_SET_OP_TRANSPOSE_/_ProjectSetOpTransposeRule_, etc.) can recover it. Unless I am mistaken and someone can suggest further rules that can lead to the same plan as before, this seems like a regression. My proposal would be to either revert the change, or to at least expose the "old" behavior via a configuration flag. My understanding is that, the original rewrite proposed in CALCITE-6893 was supposed to rewritable (by means of _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ rule). The rewrite was incorrect and the currently implemented form has been suggested, but it hasn't been discussed if the "new" plan shape (post CALCITE-6893) could be transformed into the "old" plan shape (pre CALCITE-6893). -- This message was sent by Atlassian Jira (v8.20.10#820010)