[ https://issues.apache.org/jira/browse/CALCITE-6236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17813258#comment-17813258 ]
Stamatis Zampetakis commented on CALCITE-6236:
----------------------------------------------
If the Filter is what throws off the estimate, then maybe the fix belongs in
that logic. Alternatively, the {{EnumerableBatchNestedLoopJoin}} estimate
could be multiplied by some constant factor (maybe in correlation with the
batch size) to rectify the underestimate. Adding more fields to the join or
traversing the tree to perform the estimation are not customary solutions.
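
As a rough illustration of the constant-factor idea, the sketch below scales an underestimated join row count by a correction factor. The class and method names are hypothetical and not part of Calcite's API. The factor of 4 is taken from the selectivity reported in the issue, and how such a factor would be derived from the batch size is left open.

```java
public final class BatchJoinRowCountSketch {
    private BatchJoinRowCountSketch() {}

    /**
     * Hypothetical correction: multiplies the estimated row count by a
     * constant factor to offset the reduction introduced by the extra Filter.
     * This is an illustration only, not Calcite code.
     */
    static double correctedRowCount(double estimatedRows, double correctionFactor) {
        if (correctionFactor <= 0) {
            throw new IllegalArgumentException("correctionFactor must be positive");
        }
        return estimatedRows * correctionFactor;
    }

    public static void main(String[] args) {
        double estimatedRows = 460.0;  // row count shown for the batch join in the issue
        double correctionFactor = 4.0; // assumed: inverse of the reported filter selectivity
        System.out.println(correctedRowCount(estimatedRows, correctionFactor));
    }
}
```

Keeping the factor as a plain parameter leaves room to tie it to the batch size later without changing the call site.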
> EnumerableBatchNestedLoopJoin uses wrong row count for cost calculation
> -----------------------------------------------------------------------
>
> Key: CALCITE-6236
> URL: https://issues.apache.org/jira/browse/CALCITE-6236
> Project: Calcite
> Issue Type: Bug
> Reporter: Ulrich Kramer
> Priority: Major
> Labels: pull-request-available
>
> {{EnumerableBatchNestedLoopJoin}} always adds a {{Filter}} on the right
> relation.
> This filter reduces the number of rows by its selectivity (in our case by a
> factor of 4).
> Therefore, {{RelMdUtil.getJoinRowCount}} returns a value 4 times lower
> compared to the one returned for a {{JdbcJoin}}.
> As a result, in most cases {{EnumerableBatchNestedLoopJoin}} is preferred
> over {{JdbcJoin}}.
> Here is an example of the different costs:
> {code}
> EnumerableProject rows=460.0 self_costs=460.0 cumulative_costs=1465.0
>   EnumerableBatchNestedLoopJoin rows=460.0 self_costs=687.5 cumulative_costs=1005.0
>     JdbcToEnumerableConverter rows=100.0 self_costs=10.0 cumulative_costs=190.0
>       JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>         JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
>     JdbcToEnumerableConverter rows=25.0 self_costs=2.5 cumulative_costs=127.5
>       JdbcFilter rows=25.0 self_costs=25.0 cumulative_costs=125.0
>         JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
> vs.
> {code}
> JdbcToEnumerableConverter rows=1585.0 self_costs=158.5 cumulative_costs=2023.5
>   JdbcJoin rows=1585.0 self_costs=1585.0 cumulative_costs=1865.0
>     JdbcProject rows=100.0 self_costs=80.0 cumulative_costs=180.0
>       JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
>     JdbcTableScan rows=100.0 self_costs=100.0 cumulative_costs=100.0
> {code}
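
The figures in the two quoted plans are consistent with each node's cumulative cost being its own cost plus the cumulative costs of its inputs. The sketch below rebuilds both plans under that assumption and recomputes the totals. The additive model is inferred from the numbers above rather than taken from Calcite's cost implementation, and the `PlanCostCheck` and `Node` types are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public final class PlanCostCheck {
    private PlanCostCheck() {}

    /** Hypothetical plan node holding only the self cost from the quoted output. */
    static final class Node {
        final String name;
        final double selfCost;
        final List<Node> inputs;

        Node(String name, double selfCost, Node... inputs) {
            this.name = name;
            this.selfCost = selfCost;
            this.inputs = Arrays.asList(inputs);
        }

        /** Assumed model: self cost plus the cumulative cost of every input. */
        double cumulativeCost() {
            double total = selfCost;
            for (Node input : inputs) {
                total += input.cumulativeCost();
            }
            return total;
        }
    }

    /** The EnumerableBatchNestedLoopJoin plan from the issue description. */
    static Node batchNestedLoopPlan() {
        Node left = new Node("JdbcToEnumerableConverter", 10.0,
            new Node("JdbcProject", 80.0, new Node("JdbcTableScan", 100.0)));
        Node right = new Node("JdbcToEnumerableConverter", 2.5,
            new Node("JdbcFilter", 25.0, new Node("JdbcTableScan", 100.0)));
        return new Node("EnumerableProject", 460.0,
            new Node("EnumerableBatchNestedLoopJoin", 687.5, left, right));
    }

    /** The JdbcJoin plan from the issue description. */
    static Node jdbcJoinPlan() {
        Node left = new Node("JdbcProject", 80.0, new Node("JdbcTableScan", 100.0));
        Node right = new Node("JdbcTableScan", 100.0);
        return new Node("JdbcToEnumerableConverter", 158.5,
            new Node("JdbcJoin", 1585.0, left, right));
    }

    public static void main(String[] args) {
        System.out.println(batchNestedLoopPlan().cumulativeCost()); // 1465.0
        System.out.println(jdbcJoinPlan().cumulativeCost());        // 2023.5
    }
}
```

Under this model the batch nested loop plan totals 1465.0 against 2023.5 for the JdbcJoin plan, which matches the reported preference for the former.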