zhuqi-lucas commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2929748156
> [EmissionType](https://docs.rs/datafusion-physical-plan/47.0.0/datafusion_physical_plan/execution_plan/enum.EmissionType.html) is the closest to what I have in mind I think. Would it make sense to wrap the inputs of `ExecutionPlan`s that report `EmissionType::Final` as an approximation? A hypothetical `ConsumptionType` per child would allow only wrapping the build side of a hash join for instance.
>
> We're kind of still 'modifying' each operator, but in a declarative fashion rather than requiring the logic of each implementation to be updated.
>
> @zhuqi-lucas, I think you are very close to a non-invasive solution that we can ship quickly 🚀 IIUC, this doesn't modify any operators, it just inserts a cancellation-friendly parent to all leaves.
>
> Three things come to mind as finishing touches:
>
> 1. Making the rule smarter, by utilizing `EmissionType` information, so that it only adds `YieldExec`s when necessary. If the path from a leaf to the root does not involve any operator that is pipeline-breaking, there is no need to insert a `YieldExec` as the parent of that leaf. I think this is similar to @pepijnve's thinking.
> 2. Having a configuration flag to sidestep this rule for users who don't want to get any (however small) performance penalty.
> 3. Benchmarking to make sure the performance penalty is small.
>
> Thanks for the awesome collaboration 💪

Thank you @pepijnve, @ozankabak, I will try to polish the code and make all CI tasks green!
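
To make points 1 and 2 of the quoted list concrete, here is a rough sketch of the idea on a toy plan tree. It is not DataFusion code: `Plan`, `insert_yields`, `rewrite`, and the `enabled` flag are hypothetical names, and the local `EmissionType` is a simplified stand-in for the real enum. It assumes that an operator reporting `Final` emission is the pipeline-breaking case discussed above.

```rust
// Toy model only: every type and function here is hypothetical and is
// NOT DataFusion's API. It only illustrates the rule discussed above.

/// Simplified stand-in for the emission behaviour of an operator.
#[derive(Debug, Clone, Copy, PartialEq)]
enum EmissionType {
    /// Emits output while still consuming input (streaming).
    Incremental,
    /// Emits output only after consuming all input (pipeline-breaking).
    Final,
}

/// Minimal plan tree: leaves, inner operators, and a yield wrapper.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Leaf(&'static str),
    Node {
        name: &'static str,
        emission: EmissionType,
        children: Vec<Plan>,
    },
    /// Stand-in for a cancellation-friendly parent such as `YieldExec`.
    Yield(Box<Plan>),
}

/// Entry point. `enabled` models the configuration flag from point 2:
/// when it is false the plan is returned untouched.
fn insert_yields(plan: Plan, enabled: bool) -> Plan {
    if !enabled {
        return plan;
    }
    rewrite(plan, false)
}

/// Walks from the root down, remembering whether any ancestor is
/// pipeline-breaking. Only leaves below such an ancestor get wrapped.
fn rewrite(plan: Plan, below_breaker: bool) -> Plan {
    match plan {
        Plan::Leaf(name) if below_breaker => Plan::Yield(Box::new(Plan::Leaf(name))),
        Plan::Node { name, emission, children } => {
            let breaker = below_breaker || emission == EmissionType::Final;
            Plan::Node {
                name,
                emission,
                children: children.into_iter().map(|c| rewrite(c, breaker)).collect(),
            }
        }
        // Leaves on a fully streaming path and existing wrappers stay as-is.
        other => other,
    }
}
```

With this sketch, a scan under only streaming operators is left alone, while a scan under an operator with `Final` emission gets a `Yield` parent. A per-child `ConsumptionType`, as suggested in the quote, would refine the `breaker` decision per child instead of per operator.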