alamb commented on PR #15981: URL: https://github.com/apache/datafusion/pull/15981#issuecomment-2867285850
> I think something like that is done already in the "convert to state" logic - it will dynamically decide to skip aggregating once it sees that the group vs input rows ratio is small.

I agree.

Specifically, see https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.GroupsAccumulator.html#method.convert_to_state and similar functions.

These [config value thresholds](https://datafusion.apache.org/user-guide/configs.html) control the behavior:

key | default | description
-- | -- | --
datafusion.execution.skip_partial_aggregation_probe_ratio_threshold | 0.8 | Aggregation ratio (number of distinct groups / number of input rows) threshold for skipping partial aggregation. If the value is greater, then partial aggregation will skip aggregation for further input.
datafusion.execution.skip_partial_aggregation_probe_rows_threshold | 100000 | Number of input rows partial aggregation partition should process, before aggregation ratio check and trying to switch to skipping aggregation mode.
datafusion.execution.use_row_number_estimates_to_optimize_partitioning | false | Should DataFusion use row number estimates at the input to decide whether increasing parallelism is beneficial or not. By default, only exact row numbers (not estimates) are used for this decision. Setting this flag to true will likely produce better plans if the source of statistics is accurate. We plan to make this the default in the future.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
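For readers who want to see how the two `skip_partial_aggregation_probe_*` settings interact, here is a rough, standalone sketch of the decision they describe. It is not DataFusion's implementation: `SkipProbeConfig`, `SkipProbe`, and their methods are hypothetical names, and treating a ratio at or above the threshold as the trigger is an assumption about the boundary case.

```rust
/// Hypothetical stand-in for the two settings quoted above; not a DataFusion type.
#[derive(Debug, Clone, Copy)]
pub struct SkipProbeConfig {
    /// Mirrors `skip_partial_aggregation_probe_ratio_threshold` (default 0.8).
    pub ratio_threshold: f64,
    /// Mirrors `skip_partial_aggregation_probe_rows_threshold` (default 100000).
    pub rows_threshold: usize,
}

impl Default for SkipProbeConfig {
    fn default() -> Self {
        Self {
            ratio_threshold: 0.8,
            rows_threshold: 100_000,
        }
    }
}

/// Hypothetical probe that watches how much partial aggregation reduces its input.
#[derive(Debug)]
pub struct SkipProbe {
    config: SkipProbeConfig,
    input_rows: usize,
    should_skip: bool,
    decided: bool,
}

impl SkipProbe {
    pub fn new(config: SkipProbeConfig) -> Self {
        Self {
            config,
            input_rows: 0,
            should_skip: false,
            decided: false,
        }
    }

    /// Record one processed batch: the rows it contained and the total number
    /// of distinct groups seen so far in this partition.
    pub fn update(&mut self, batch_rows: usize, total_groups: usize) {
        if self.decided {
            // The decision is made once and then kept.
            return;
        }
        self.input_rows += batch_rows;
        // Only check the ratio after enough rows have been observed.
        if self.input_rows >= self.config.rows_threshold {
            let ratio = total_groups as f64 / self.input_rows as f64;
            // Assumption: a ratio at or above the threshold triggers skipping.
            self.should_skip = ratio >= self.config.ratio_threshold;
            self.decided = true;
        }
    }

    /// True once the probe has concluded that aggregating is not worthwhile.
    pub fn should_skip(&self) -> bool {
        self.should_skip
    }
}
```

With the default values, a partition that has seen 100000 rows and 95000 distinct groups has a ratio of 0.95, so this sketch would switch to skipping; a partition with only a handful of groups over the same number of rows would keep aggregating.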