[
https://issues.apache.org/jira/browse/CALCITE-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896003#comment-17896003
]
Ruben Q L commented on CALCITE-6674:
------------------------------------
"Beneficial" might be a bit subjective here, and might depend on a case by
case. We could somehow make a cost-based decision comparing rule-based
decorrelation and main-logic decorrelation, but that might be overkill and
could break backwards compatibility in some cases.
IMO making this configurable has the advantage that the caller can not only
remove some of the "default" rules, but also add new customized rules that may
make sense in their particular decorrelation scenario.
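To illustrate the idea, here is a minimal self-contained sketch. It is a toy
model and not Calcite's API: plans are plain strings, and Rule, defaultRules,
mainLogic and both decorrelate overloads are hypothetical names chosen for
this example only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

/** Toy model only: plans are plain strings, rules are named string rewrites. */
public class DecorrelatorSketch {

  /** Hypothetical stand-in for a planner rule. */
  static final class Rule {
    final String name;
    final UnaryOperator<String> rewrite;

    Rule(String name, UnaryOperator<String> rewrite) {
      this.name = name;
      this.rewrite = rewrite;
    }
  }

  // Mimics the rule named in the ticket: it turns the Correlate into an
  // Aggregate on top of a Join.
  static final Rule SCALAR_AGGREGATE_RULE = new Rule(
      "RemoveCorrelationForScalarAggregateRule",
      p -> p.replace("Correlate(left, Aggregate(right))",
          "Aggregate(Join(left, right))"));

  /** The rule set used when the caller configures nothing. */
  static List<Rule> defaultRules() {
    return Collections.singletonList(SCALAR_AGGREGATE_RULE);
  }

  /** Step 2 stand-in: general decorrelation of any Correlate still present. */
  static String mainLogic(String plan) {
    return plan.replace("Correlate(left, Aggregate(right))",
        "Join(left, Aggregate(right))");
  }

  /** Step 1 applies the configured rules in order, then step 2 runs. */
  static String decorrelate(String plan, List<Rule> rules) {
    String current = plan;
    for (Rule rule : rules) {
      current = rule.rewrite.apply(current);
    }
    return mainLogic(current);
  }

  /** Backwards-compatible entry point: same rules as before. */
  static String decorrelate(String plan) {
    return decorrelate(plan, defaultRules());
  }

  public static void main(String[] args) {
    String plan = "Correlate(left, Aggregate(right))";
    // Default behavior: the rule fires first.
    System.out.println(decorrelate(plan));

    // Caller removes one default rule, so the main logic handles the Correlate.
    List<Rule> custom = new ArrayList<>(defaultRules());
    custom.removeIf(r -> r.name.equals("RemoveCorrelationForScalarAggregateRule"));
    System.out.println(decorrelate(plan, custom));
  }
}
```

The overload without a rule list keeps today's behavior, while the overload
taking a list lets the caller drop a default rule or append a custom one.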
> Make RelDecorrelator rules configurable
> ---------------------------------------
>
> Key: CALCITE-6674
> URL: https://issues.apache.org/jira/browse/CALCITE-6674
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Ruben Q L
> Assignee: Ruben Q L
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.39.0
>
>
> RelDecorrelator decorrelates a query in two main steps:
> - First, a few Correlate cases are removed via rules (in the
> {{removeCorrelationViaRule}} method).
> - Then, the main decorrelation logic is applied ({{decorrelateRel}}
> methods called by reflection).
> Currently, the rules applied on the first step are hardcoded, and cannot be
> configured.
> We are facing a situation where a Correlate is converted via one of these
> hardcoded rules (RemoveCorrelationForScalarAggregateRule), when in fact the
> main decorrelation logic (if the rule were not applied) would offer an
> arguably more beneficial plan:
> {noformat}
> -- Original (sub)plan
> LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0, 1}])
>   LogicalJoin(condition=[=($0, $5)], joinType=[inner])
>     LogicalTableScan(table=[[session, partsupp]])
>     LogicalProject(p_partkey=[$0])
>       LogicalFilter(condition=[LIKE($1, 'forest%':VARCHAR)])
>         LogicalTableScan(table=[[session, part]])
>   LogicalProject(EXPR$0=[*(0.50, $0)])
>     LogicalAggregate(group=[{}], agg#0=[SUM($0)])
>       LogicalProject(l_quantity=[$4])
>         LogicalFilter(condition=[AND(=($1, $cor0.ps_partkey), =($2, $cor0.ps_suppkey), SEARCH($10, Sarg[[1994-01-01..1995-01-01)]))])
>           LogicalTableScan(table=[[session, lineitem]])
> -- Decorrelation via RemoveCorrelationForScalarAggregateRule
> LogicalProject(ps_partkey=[$0], ps_suppkey=[$1], ps_availqty=[$2], ps_supplycost=[$3], ps_comment=[$4], p_partkey=[$5], $f6=[*(0.50, $6)])
>   LogicalAggregate(group=[{0, 1, 2, 3, 4, 5}], agg#0=[SUM($6)])
>     LogicalProject(ps_partkey=[$0], ps_suppkey=[$1], ps_availqty=[$2], ps_supplycost=[$3], ps_comment=[$4], p_partkey=[$5], l_quantity=[$10])
>       LogicalJoin(condition=[AND(=($7, $0), =($8, $1), SEARCH($16, Sarg[[1994-01-01..1995-01-01)]))], joinType=[left])
>         LogicalJoin(condition=[=($0, $5)], joinType=[inner])
>           LogicalTableScan(table=[[session, partsupp]])
>           LogicalProject(p_partkey=[$0])
>             LogicalFilter(condition=[LIKE($1, 'forest%':VARCHAR)])
>               LogicalTableScan(table=[[session, part]])
>         LogicalProject(l_orderkey=[$0], l_partkey=[$1], l_suppkey=[$2], l_linenumber=[$3], l_quantity=[$4], l_extendedprice=[$5], l_discount=[$6], l_tax=[$7], l_returnflag=[$8], l_linestatus=[$9], l_shipdate=[$10], l_commitdate=[$11], l_receiptdate=[$12], l_shipinstruct=[$13], l_shipmode=[$14], l_comment=[$15], nullIndicator=[true])
>           LogicalTableScan(table=[[session, lineitem]])
> -- Decorrelation via main logic (without RemoveCorrelationForScalarAggregateRule)
> LogicalJoin(condition=[AND(=($0, $6), =($1, $7))], joinType=[left])
>   LogicalJoin(condition=[=($0, $5)], joinType=[inner])
>     LogicalTableScan(table=[[session, partsupp]])
>     LogicalProject(p_partkey=[$0])
>       LogicalFilter(condition=[LIKE($1, 'forest%':VARCHAR)])
>         LogicalTableScan(table=[[session, part]])
>   LogicalAggregate(group=[{0, 1}], agg#0=[SUM($2)])
>     LogicalProject(l_partkey=[$1], l_suppkey=[$2], l_quantity=[$4])
>       LogicalFilter(condition=[AND(SEARCH($10, Sarg[[1994-01-01..1995-01-01)]), IS NOT NULL($1), IS NOT NULL($2))])
>         LogicalTableScan(table=[[session, lineitem]])
> {noformat}
> The idea of this ticket is to make the rules used by
> {{RelDecorrelator#removeCorrelationViaRule}} configurable. By default,
> everything will behave as before (the same "default" rules will be applied),
> so we keep backwards compatibility; but we shall offer new methods to allow
> RelDecorrelator's caller to tune the rules to be used in this step.