[
https://issues.apache.org/jira/browse/CALCITE-7365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated CALCITE-7365:
------------------------------------
Labels: pull-request-available (was: )
> RelMdRowCount ignores estimateRowCount() overrides in SingleRel's subclasses
> ----------------------------------------------------------------------------
>
> Key: CALCITE-7365
> URL: https://issues.apache.org/jira/browse/CALCITE-7365
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.41.0
> Reporter: Alessandro Solimando
> Assignee: Alessandro Solimando
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.42.0
>
>
> The _SingleRel_ handler in _RelMdRowCount_
> ([here|https://github.com/apache/calcite/blob/506950a3ebd4807b36901bededc64f7c60497712/core/src/main/java/org/apache/calcite/rel/metadata/RelMdRowCount.java#L194-L197])
> always returns the input's row count, ignoring any _estimateRowCount()_
> override in subclasses:
> {code:java}
> public @Nullable Double getRowCount(SingleRel rel, RelMetadataQuery mq) {
>   return mq.getRowCount(rel.getInput());
> }
> {code}
>
> This makes it impossible for custom _SingleRel_ operators to provide accurate
> row count estimates without implementing a custom metadata handler.
> The _RelNode_ catch-all handler
> ([here|https://github.com/apache/calcite/blob/506950a3ebd4807b36901bededc64f7c60497712/core/src/main/java/org/apache/calcite/rel/metadata/RelMdRowCount.java#L64-L72])
> correctly delegates to _estimateRowCount()_:
> {code:java}
> public @Nullable Double getRowCount(RelNode rel, RelMetadataQuery mq) {
>   return rel.estimateRowCount(mq);
> }
> {code}
> The _SingleRel_ handler should do the same for consistency.
> Reproducer (can be added to _RelMetadataTest_):
>
> {code:java}
> private static class ExpandingRel extends SingleRel {
>   private static final double EXPANSION_FACTOR = 10.0;
>
>   ExpandingRel(RelOptCluster cluster, RelTraitSet traits, RelNode input) {
>     super(cluster, traits, input);
>   }
>
>   @Override public double estimateRowCount(RelMetadataQuery mq) {
>     return mq.getRowCount(input) * EXPANSION_FACTOR;
>   }
> }
>
> @Test void testRowCountCustomSingleRel() {
>   final RelNode scan = sql("select * from emp").toRel();
>   final ExpandingRel expanding =
>       new ExpandingRel(scan.getCluster(), scan.getTraitSet(), scan);
>   final RelMetadataQuery mq = scan.getCluster().getMetadataQuery();
>   final Double rowCount = mq.getRowCount(expanding);
>   // Returns 14.0 (input row count) instead of 140.0 (input * 10)
>   assertThat(rowCount, is(EMP_SIZE * 10));
> }
> {code}
>
> Fix:
> Change the _SingleRel_ handler to delegate to _estimateRowCount()_:
> {code:java}
> public @Nullable Double getRowCount(SingleRel rel, RelMetadataQuery mq) {
>   return rel.estimateRowCount(mq);
> }
> {code}
>
> This is backward compatible since _SingleRel.estimateRowCount()_ already
> returns _mq.getRowCount(input)_ (see
> [here|https://github.com/apache/calcite/blob/506950a3ebd4807b36901bededc64f7c60497712/core/src/main/java/org/apache/calcite/rel/SingleRel.java#L66-L69]).
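
The behaviour described in the issue, and the backward-compatibility argument, can be illustrated with a small standalone model. This is only a sketch: the types below (Node, Scan, Single, Expanding) and the two handler methods are hypothetical stand-ins, not Calcite classes, and the real handlers go through RelMetadataQuery rather than calling the node directly. The row count of 14 and the factor of 10 mirror the reproducer.

```java
// Simplified, hypothetical model of the two handler variants; not Calcite code.
public class RowCountDispatchSketch {
  /** Stand-in for RelNode: anything that can estimate its own row count. */
  interface Node {
    double estimateRowCount();
  }

  /** Stand-in for a table scan with a fixed row count. */
  static final class Scan implements Node {
    private final double rows;
    Scan(double rows) { this.rows = rows; }
    @Override public double estimateRowCount() { return rows; }
  }

  /** Stand-in for SingleRel: by default it reports its input's row count. */
  static class Single implements Node {
    final Node input;
    Single(Node input) { this.input = input; }
    @Override public double estimateRowCount() { return input.estimateRowCount(); }
  }

  /** Stand-in for ExpandingRel: overrides the estimate with input * 10. */
  static final class Expanding extends Single {
    Expanding(Node input) { super(input); }
    @Override public double estimateRowCount() { return input.estimateRowCount() * 10.0; }
  }

  /** Handler before the change: reads the input directly, so overrides are bypassed. */
  static double rowCountBefore(Single rel) {
    return rel.input.estimateRowCount();
  }

  /** Handler after the change: delegates to the node, so overrides take effect. */
  static double rowCountAfter(Single rel) {
    return rel.estimateRowCount();
  }

  public static void main(String[] args) {
    Node scan = new Scan(14.0);
    Single plain = new Single(scan);
    Single expanding = new Expanding(scan);

    System.out.println(rowCountBefore(plain));     // 14.0
    System.out.println(rowCountAfter(plain));      // 14.0 (unchanged for non-overriding nodes)
    System.out.println(rowCountBefore(expanding)); // 14.0 (override ignored)
    System.out.println(rowCountAfter(expanding));  // 140.0 (override honoured)
  }
}
```

In this model the two handlers agree for any node that keeps the default estimate, and differ only for nodes that override it, which is the sense in which the proposed change is backward compatible.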
--
This message was sent by Atlassian Jira
(v8.20.10#820010)