adriangb commented on code in PR #16641:
URL: https://github.com/apache/datafusion/pull/16641#discussion_r2180379264

##########
datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs:
##########
@@ -216,7 +218,28 @@ fn pushdown_sorts_helper(
 fn pushdown_requirement_to_children(
     plan: &Arc<dyn ExecutionPlan>,
     parent_required: OrderingRequirements,
+    parent_fetch: Option<usize>,
 ) -> Result<Option<Vec<Option<OrderingRequirements>>>> {
+    // Only attempt to push down TopK when there is an upstream LIMIT
+    if parent_fetch.is_some() {
+        // 1) Never push a new TopK below an operator that already has its own fetch
+        if plan.fetch().is_some() {
+            return Ok(None);
+        }
+        // 2) Only allow pushdown through operators that do not increase row count
+        // (equal or lower-equal cardinality). Any other operator (including joins,
+        // sort-with-limit, or UDTFs that may expand rows) must stop the pushdown.
+        let effect = plan.cardinality_effect();
+        if !matches!(
+            effect,
+            CardinalityEffect::Equal | CardinalityEffect::LowerEqual
+        ) {
+            return Ok(None);
+        }

Review Comment:
   Should this be:

   ```suggestion
           let effect = plan.cardinality_effect();
           if !matches!(
               effect,
               CardinalityEffect::Equal
           ) {
               return Ok(None);
           }
   ```


##########
datafusion/physical-optimizer/src/enforce_sorting/sort_pushdown.rs:
##########
@@ -216,7 +218,28 @@ fn pushdown_sorts_helper(
 fn pushdown_requirement_to_children(
     plan: &Arc<dyn ExecutionPlan>,
     parent_required: OrderingRequirements,
+    parent_fetch: Option<usize>,
 ) -> Result<Option<Vec<Option<OrderingRequirements>>>> {
+    // Only attempt to push down TopK when there is an upstream LIMIT
+    if parent_fetch.is_some() {

Review Comment:
   Can we pass down `parent.cardinality_effect()` instead? I would do `if !matches!(parent.cardinality_effect(), CardinalityEffect::Equal)`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
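
For context on the `LowerEqual` question raised in the first comment, here is a minimal standalone sketch of why a fetch may not be safe to push below a row-dropping operator. It is not DataFusion code: the `CardinalityEffect` enum is a local stand-in that mirrors the variant names of the real one, and `can_push_fetch_through`, `top_k`, `keep_even`, and `times_ten` are hypothetical helpers invented for illustration. The projection is assumed to preserve ordering.

```rust
/// Local stand-in for DataFusion's `CardinalityEffect`, redefined here so the
/// sketch compiles on its own (assumption: same variant names).
#[derive(Debug, Clone, Copy, PartialEq)]
enum CardinalityEffect {
    Unknown,
    Equal,
    LowerEqual,
    GreaterEqual,
}

/// Hypothetical gate (not a DataFusion API): allow a fetch to move below an
/// operator only when that operator keeps the row count unchanged.
fn can_push_fetch_through(effect: CardinalityEffect) -> bool {
    matches!(effect, CardinalityEffect::Equal)
}

/// TopK stand-in: sort ascending and keep the first `k` rows.
fn top_k(mut rows: Vec<i32>, k: usize) -> Vec<i32> {
    rows.sort();
    rows.truncate(k);
    rows
}

/// Filter stand-in: may drop rows, so its effect is `LowerEqual`.
fn keep_even(rows: Vec<i32>) -> Vec<i32> {
    rows.into_iter().filter(|v| *v % 2 == 0).collect()
}

/// Order-preserving projection stand-in: one row out per row in (`Equal`).
fn times_ten(rows: Vec<i32>) -> Vec<i32> {
    rows.into_iter().map(|v| v * 10).collect()
}

fn main() {
    let input = vec![5, 1, 4, 3, 2, 6];

    // TopK above the filter: the two smallest even values.
    let above_filter = top_k(keep_even(input.clone()), 2);
    // TopK pushed below the filter: the filter then drops one of the two rows.
    let below_filter = keep_even(top_k(input.clone(), 2));
    assert_eq!(above_filter, vec![2, 4]);
    assert_eq!(below_filter, vec![2]);

    // With a row-preserving projection, both placements agree.
    let above_proj = top_k(times_ten(input.clone()), 2);
    let below_proj = times_ten(top_k(input, 2));
    assert_eq!(above_proj, below_proj);

    // Print what the stricter gate decides for each variant.
    for effect in [
        CardinalityEffect::Unknown,
        CardinalityEffect::Equal,
        CardinalityEffect::LowerEqual,
        CardinalityEffect::GreaterEqual,
    ] {
        println!("{:?}: push fetch = {}", effect, can_push_fetch_through(effect));
    }
}
```

In this toy model the filter case returns fewer rows than the limit asked for once the fetch sits below it, which is the kind of result change that restricting the check to `CardinalityEffect::Equal` would rule out. Whether the same reasoning applies to every `LowerEqual` operator in the actual optimizer is for the PR discussion to settle.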