ctsk commented on PR #15479: URL: https://github.com/apache/datafusion/pull/15479#issuecomment-2783511665
@berkaysynnada I had a look at `ExecutionPlanProperties::pipeline_behavior()`. I think it is not *quite* what I want here: for the HashJoin, I want to remove the coalesce on the build side but keep it on the probe side. The pipeline behaviour doesn't tell me which child is processed batch-wise and which child is processed incrementally.

I could add a blanket rule for other plans, potentially ones outside the DataFusion repo, that removes the coalesce for each child of a plan that does not have `EmissionType::Incremental`. Unfortunately this does not cover the rule for the aggregation: there I purposefully kept the `CoalesceBatchesExec` underneath partial aggregations, because those can switch to passing batches through without aggregating.

So far, this PR is a lot of reasoning about what *might* make sense, but in the end it comes down to measuring the impact for each operator.
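
To make the per-child distinction concrete, here is a minimal toy sketch of the rule described in this comment. It is not DataFusion code: the `Plan` enum and the functions `strip_coalesce` and `remove_redundant_coalesce` are hypothetical stand-ins invented for illustration, and the real optimizer rule in the PR works on `ExecutionPlan` trees instead. The sketch only shows the shape of the decision: the coalesce is dropped on the HashJoin build side, kept on the probe side, and kept underneath a partial aggregation.

```rust
// Toy model only: `Plan` is a hypothetical stand-in, not DataFusion's ExecutionPlan.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan,
    CoalesceBatches(Box<Plan>),
    HashJoin { build: Box<Plan>, probe: Box<Plan> },
    PartialAggregate(Box<Plan>),
}

/// Remove a CoalesceBatches node sitting directly on top of `plan`, if there is one.
fn strip_coalesce(plan: Plan) -> Plan {
    match plan {
        Plan::CoalesceBatches(inner) => *inner,
        other => other,
    }
}

/// Recursively rewrite the plan, deciding per child whether a coalesce is kept.
fn remove_redundant_coalesce(plan: Plan) -> Plan {
    match plan {
        Plan::HashJoin { build, probe } => Plan::HashJoin {
            // Build side is consumed as a whole before probing starts: drop the coalesce.
            build: Box::new(strip_coalesce(remove_redundant_coalesce(*build))),
            // Probe side is processed incrementally: keep the coalesce.
            probe: Box::new(remove_redundant_coalesce(*probe)),
        },
        // Partial aggregation may switch to passing batches through: keep the coalesce.
        Plan::PartialAggregate(input) => {
            Plan::PartialAggregate(Box::new(remove_redundant_coalesce(*input)))
        }
        // A coalesce is only removed by its parent, so here we just recurse.
        Plan::CoalesceBatches(input) => {
            Plan::CoalesceBatches(Box::new(remove_redundant_coalesce(*input)))
        }
        Plan::Scan => Plan::Scan,
    }
}
```

The decision is made by the parent for each child separately, which is exactly the information a single per-plan property cannot express for the HashJoin.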