Arnaud Jegou created CALCITE-7144: ------------------------------------- Summary: LIMIT should not be pushed through projections containing window functions Key: CALCITE-7144 URL: https://issues.apache.org/jira/browse/CALCITE-7144 Project: Calcite Issue Type: Bug Affects Versions: 1.40.0 Reporter: Arnaud Jegou
_This is an issue with the RelBuilder, we have not noticed this behaviour with rules_ When adding a limit/offset with no collation, the RelBuilder will try to push it down to a lower sort with no limit or offset, even if there is a project in between Basically instead of generating the following: {code:java} LogicalSort(fetch=[10]) LogicalProject(xxx) LogicalSort(sort0=[$1], dir0=[ASC]) {code} it will generate: {code:java} LogicalProject(xxx) LogicalSort(fetch=[10], sort0=[$1], dir0=[ASC]) {code} This is valid for "regular" projects, but not if the project contains a window function as in this case the limit would be applied before the aggregation and produce invalid results What actually happens in the code is, in [RelBuilder.sortLimit:|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3662] # if there is [no collation|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3687] # and the current top node [is a project|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3706] # and the input of that project [is a Sort|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3708] # and that sort has [no limit or offset|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3710] # then the "new" limit and offset are [pushed down into that sort|https://github.com/apache/calcite/blob/66afac689ea5b3f59c8050f9a59619c5c5bee771/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L3711] {code:java} if (fieldCollations.isEmpty()) { // 1 // ... if (top instanceof Project) { // 2 final Project project = (Project) top; if (project.getInput() instanceof Sort) { // 3 final Sort sort2 = (Sort) project.getInput(); if (sort2.offset == null && sort2.fetch == null) { // 4 final RelNode sort = struct.sortFactory.createSort(sort2.getInput(), sort2.collation, offsetNode, fetchNode); // 5 replaceTop( struct.projectFactory.createProject(sort, project.getHints(), project.getProjects(), Pair.right(project.getNamedProjects()), project.getVariablesSet())); return this; {code} One possible fix would simply be to check that the project has no window function in 3) by replacing the condition with {code:java} if (!project.containsOver() && project.getInput() instanceof Sort) { {code} I am not 100% sure that this is the right fix but it looks correct, and the tests do run against a version containing this change without errors Here is a test case that covers this scenario: {code:java} /** * Test case for * <a href="https://issues.apache.org/jira/browse/CALCITE-XXX">[CALCITE-XXX] * RelBuilder pushes limits through window functions which is invalid</a>. * The test checks that the RelBuilder does not merge a limit through a Project * that contains a windowed aggregate function into a lower sort. */ @Test void testLimitOverProjectWithWindowFunctions() { final Function<RelBuilder, RelNode> f = b -> b.scan("EMP") .sort(1) .project(b.field("DEPTNO"), b.aggregateCall(SqlStdOperatorTable.ROW_NUMBER) .over() .partitionBy() .orderBy(b.field("EMPNO")) .rowsUnbounded() .as("x")) .limit(0, 10) .build(); final String expected = "" + "LogicalSort(fetch=[10])\n" + " LogicalProject(DEPTNO=[$7], x=[ROW_NUMBER() OVER (ORDER BY $0)])\n" + " LogicalSort(sort0=[$1], dir0=[ASC])\n" + " LogicalTableScan(table=[[scott, EMP]])\n"; assertThat(f.apply(createBuilder()), hasTree(expected)); }{code} And I opened a pr to show what the fix would look like, if the proposed fix is correct: [https://github.com/apache/calcite/pull/4503] (but feel free to discard it and create a new one if need be) -- This message was sent by Atlassian Jira (v8.20.10#820010)