[
https://issues.apache.org/jira/browse/CALCITE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18062300#comment-18062300
]
Julian Hyde commented on CALCITE-7420:
--------------------------------------
{quote}1. Currently, {{!sub-plan}} and {{!ok}} run different rules, which might
cause misunderstandings. Or, could I simply apply Rule1 and Rule2 to this SQL
and observe the results? This issue could perhaps be discussed in another
PR.{quote}
I acknowledge that there is a conceptual gap.
I can't think of an elegant way to close the gap - the rules being tested are
usually logical, so a lot of rules would have to be supplied to take it to a
physical form, say by applying enumerable rules. But note that any valid
implementation would return the same results.
I think it is useful for the reader of the .iq script to know that the query is
valid, what is its type, and to see the results on a real data set. It simply
makes things more concrete.
Some of the differences today are because the mock schema the scott schema have
diverged - the NAME column in one is called DNAME in the other, the DEPTNO
column has types TINYINT and INT, and the SAL column has types INT and DECIMAL.
Running the tests is proving to be a (good) forcing function to converge those
schemas.
{quote}2. We now support a new decorrelator algorithm,
TopDownGeneralDecorrelator. Should we support it in {{!sub-plan}}?{quote}
If it's a planner rule, yes. If it's something else, probably not.
{{RelOptRulesTest}} (and this set of .iq-based tests) tests planner rules. It
does that by checking that plan A becomes plan B when we apply the rule. We use
SQL and pre-rules as a convenient way to create plan A. If
{{TopDownGeneralDecorrelator}} is the *only* way to create plan A, I guess I'd
allow it. But if that plan can be created by writing a particular SQL query,
maybe the test should do that.
The {{!sub-plan}} approach could be repeated in other areas of Calcite, such as
{{SqlToRelConverterTest}}, which I guess includes
{{TopDownGeneralDecorrelator}}.
{quote}3. How can we tell which tests have been migrated? Will you delete the
migrated cases in RelOptRulesTest at the end? The current PR doesn’t seem to
have done that yet.{quote}
Yes, I am deleting methods from {{RelOptRulesTest.java}} and XML hunks from
{{RelOptRulesTest.xml}} as I go. The diff is a little confused at this point -
it shows 17,000 lines deleted, 5,700 lines added in \{{RelOptRulesTest.xml}},
when in fact the only changes are deletions - but will get clearer when we've
deleted more XML hunks.
> Convert RelOptRulesTest to Quidem scripts
> -----------------------------------------
>
> Key: CALCITE-7420
> URL: https://issues.apache.org/jira/browse/CALCITE-7420
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
> Priority: Major
> Labels: pull-request-available
>
> Many test cases in RelOptRulesTest are simple - involve just a SQL query, set
> of rules, before and after plan. If these were converted to Quidem scripts
> they would be easier to write (because you are writing just one file), more
> maintainable (because Quidem scripts merge more easily than XML), and could
> potentially be run from another language (which would enable ports of Calcite
> to language such as Rust or Python).
> Here is how one test would look:
> {noformat}
> # Pull aggregate through union ====================
> # Tests that AggregateUnionAggregateRule can convert
> # a union of two ’select distinct’ queries to a ’select distinct’
> # of a union.
> !set rules "AGGREGATE_UNION_AGGREGATE"
> select deptno, job
> from (select deptno, job
> from emp as e1
> group by deptno, job
> union all
> select deptno, job
> from emp as e2
> group by deptno, job)
> group by deptno, job;
> +--------+-----------+
> | DEPTNO | JOB |
> +--------+-----------+
> | 20 | CLERK |
> | 30 | SALESMAN |
> | 20 | MANAGER |
> | 30 | MANAGER |
> | 10 | MANAGER |
> | 20 | ANALYST |
> | 10 | PRESIDENT |
> | 30 | CLERK |
> | 10 | CLERK |
> +--------+-----------+
> (9 rows)
> !ok
> LogicalAggregate(group=[{0, 1}])
> LogicalUnion(all=[true])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$7], JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$7], JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> !plan-before
> LogicalAggregate(group=[{0, 1}])
> LogicalUnion(all=[true])
> LogicalProject(DEPTNO=[$7], JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalProject(DEPTNO=[$7], JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> !plan-after
> {noformat}
>
> It has the same information as the Java method
> {code:java}
> @Test void testPullAggregateThroughUnion() {
> final String sql = "select deptno, job from"
> + " (select deptno, job from emp as e1"
> + " group by deptno,job"
> + " union all"
> + " select deptno, job from emp as e2"
> + " group by deptno,job)"
> + " group by deptno,job";
> sql(sql)
> .withRule(CoreRules.AGGREGATE_UNION_AGGREGATE)
> .check();
> }
> {code}
> and XML resources
> {code:xml}
> <TestCase name="testPullAggregateThroughUnion">
> <Resource name="sql">
> <![CDATA[select deptno, job from (select deptno, job from emp as e1
> group by deptno,job union all select deptno, job from emp as e2 group by
> deptno,job) group by deptno,job]]>
> </Resource>
> <Resource name="planBefore">
> <![CDATA[
> LogicalAggregate(group=[{0, 1}])
> LogicalUnion(all=[true])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$7], JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> LogicalAggregate(group=[{0, 1}])
> LogicalProject(DEPTNO=[$7], JOB=[$2])
> LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> ]]>
> </Resource>
> {code}
> It uses {{!set rules}} command (added in CALCITE-6335) and would require new
> {{!plan-before}} and {{!plan-after}} commands.
> I suspect that Claude would do a quite good job migrating tests from .java
> and .xml to .iq. I think it could also group tests by subject area (e.g. join
> and aggregate) so that .iq files are ~1,000 lines.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)