[
https://issues.apache.org/jira/browse/CALCITE-7173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated CALCITE-7173:
------------------------------------
Labels: pull-request-available (was: )
> Improve RelMdDistinctRowCount estimation for lossless casts
> -----------------------------------------------------------
>
> Key: CALCITE-7173
> URL: https://issues.apache.org/jira/browse/CALCITE-7173
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.40.0
> Reporter: Alessandro Solimando
> Assignee: Alessandro Solimando
> Priority: Minor
> Labels: pull-request-available
>
> Consider the following test for _RelMetadataTest_:
> {code:java}
> @Test
> void testAggregateDistinctRowCountLosslessCast() {
>   final String values = "values ('b', 10), ('b', 20), ('b', 30)";
>   final String sql =
>       "select name, cast(sal as varchar(11)) from (" + values + ") t(name, sal) " +
>       "group by name, cast(sal as varchar(11))";
>   sql(sql).assertThatDistinctRowCount(bitSetOf(1), is(3d));
> }
> {code}
> The test currently fails as follows:
> {noformat}
> Expected: is <3.0>
> but: was <1.6439107033725735>
> {noformat}
> For lossless casts (and in general for injective functions), one would expect
> "NDV(CAST($i)) = NDV($i)" to hold.
> A minimal fix would extend the code at
> [RelMdUtil.java#L596|https://github.com/apache/calcite/blob/calcite-1.40.0/core/src/main/java/org/apache/calcite/rel/metadata/RelMdUtil.java#L596]
> to treat lossless casts as references to input fields, since that code is
> used only in
> [RelMdDistinctRowCount|https://github.com/apache/calcite/blob/calcite-1.40.0/core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java#L258]
> and, in the same spirit, in
> [RelMdPopulationSize|https://github.com/apache/calcite/blob/calcite-1.40.0/core/src/main/java/org/apache/calcite/rel/metadata/RelMdPopulationSize.java#L138].
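The expectation "NDV(CAST($i)) = NDV($i)" can be illustrated outside Calcite with a small standalone sketch. It uses no Calcite API: the class `LosslessCastNdv` and the helper `ndv` are hypothetical names introduced only for this illustration, an int-to-string conversion stands in for the lossless cast to varchar(11), and integer division by 100 stands in for a non-injective (lossy) expression.

```java
import java.util.List;
import java.util.function.Function;

/** Standalone illustration only; not Calcite code. */
public class LosslessCastNdv {

  /** Returns the number of distinct values of fn applied to each element of column. */
  public static <T, R> long ndv(List<T> column, Function<T, R> fn) {
    return column.stream().map(fn).distinct().count();
  }

  public static void main(String[] args) {
    // Same "sal" values as in the test from the issue description.
    List<Integer> sal = List.of(10, 20, 30);

    // NDV of the raw column.
    long base = ndv(sal, v -> v);

    // Injective mapping (assumed stand-in for a lossless cast): NDV is unchanged.
    long lossless = ndv(sal, v -> Integer.toString(v));

    // Non-injective mapping (assumed stand-in for a lossy expression): NDV can shrink.
    long lossy = ndv(sal, v -> v / 100);

    System.out.println("NDV(sal)            = " + base);     // prints 3
    System.out.println("NDV(lossless(sal))  = " + lossless); // prints 3
    System.out.println("NDV(lossy(sal))     = " + lossy);    // prints 1
  }
}
```

An injective function maps distinct inputs to distinct outputs, so the distinct count of the column carries over unchanged. That is why the test in the description expects 3 rather than a value damped by a generic expression heuristic.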
--
This message was sent by Atlassian Jira
(v8.20.10#820010)