zhuqi-lucas commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2929748156

   > [EmissionType](https://docs.rs/datafusion-physical-plan/47.0.0/datafusion_physical_plan/execution_plan/enum.EmissionType.html)
 is the closest to what I have in mind I think. Would it make sense to wrap the 
inputs of `ExecutionPlan`s that report `EmissionType::Final` as an 
approximation? A hypothetical `ConsumptionType` per child would allow only 
wrapping the build side of a hash join for instance.
   > 
   > We're kind of still 'modifying' each operator, but in a declarative 
fashion rather than requiring the logic of each implementation to be updated.
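
   To make the per-child idea above concrete, here is a toy sketch. It is not DataFusion code: `ConsumptionType` and `children_to_wrap` are hypothetical names, and treating child 0 as the hash join build side is an assumption for illustration.

```rust
// Toy model only: `ConsumptionType` and `children_to_wrap` are hypothetical
// and do not exist in DataFusion.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ConsumptionType {
    /// The operator pulls from this child batch by batch while emitting output.
    Streaming,
    /// The operator drains this child completely before emitting anything.
    Blocking,
}

/// Returns the indices of the children that would get a yielding wrapper.
fn children_to_wrap(consumption: &[ConsumptionType]) -> Vec<usize> {
    consumption
        .iter()
        .enumerate()
        .filter(|(_, c)| **c == ConsumptionType::Blocking)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    // Assumption: child 0 is the build side, child 1 is the probe side.
    let hash_join = [ConsumptionType::Blocking, ConsumptionType::Streaming];
    println!("{:?}", children_to_wrap(&hash_join));
}
```

   With a declaration like this per operator, only the build side would be wrapped and the probe side would stay untouched.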
   
   
   
   > @zhuqi-lucas, I think you are very close to a non-invasive solution that 
we can ship quickly 🚀 IIUC, this doesn't modify any operators, it just inserts 
a cancellation-friendly parent to all leaves.
   > 
   > Three things come to mind as finishing touches:
   > 
   > 1. Making the rule smarter, by utilizing `EmissionType` information, so 
that it only adds `YieldExec`s when necessary. If the path from a leaf to the 
root does not involve any operator that is pipeline-breaking, there is no need 
to insert a `YieldExec` as the parent of that leaf. I think this is similar to 
@pepijnve's thinking.
   > 2. Having a configuration flag to sidestep this rule for users who don't 
want to incur any performance penalty, however small.
   > 3. Benchmarking to make sure the performance penalty is small.
   > 
   > Thanks for the awesome collaboration 💪
   
   Thank you @pepijnve and @ozankabak, I will try to polish the code and make 
all CI tasks green!
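
   As a rough picture of suggestion 1, the toy sketch below marks a leaf for a `YieldExec` only when some operator on its path to the root is pipeline-breaking. It is a simplified model and not the PR's implementation: `Node`, `Emission`, `leaves_needing_yield`, and `example_plan` are hypothetical, and the real `EmissionType` also has a `Both` variant that is left out here.

```rust
// Toy model only: these types mimic the idea and are not DataFusion's API.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Emission {
    /// Output is produced while input is still being consumed.
    Incremental,
    /// Output is produced only after all input is consumed (pipeline-breaking).
    Final,
}

struct Node {
    name: &'static str,
    emission: Emission,
    children: Vec<Node>,
}

/// Collects the names of leaves that have at least one `Final` ancestor.
fn leaves_needing_yield(node: &Node, under_breaker: bool, out: &mut Vec<&'static str>) {
    if node.children.is_empty() {
        if under_breaker {
            out.push(node.name);
        }
        return;
    }
    // Once a pipeline-breaking operator is seen, everything below it qualifies.
    let breaker = under_breaker || node.emission == Emission::Final;
    for child in &node.children {
        leaves_needing_yield(child, breaker, out);
    }
}

/// A made-up plan: a union over an aggregate branch and a filter branch.
fn example_plan() -> Node {
    let leaf = |name| Node { name, emission: Emission::Incremental, children: vec![] };
    Node {
        name: "union",
        emission: Emission::Incremental,
        children: vec![
            Node { name: "aggregate", emission: Emission::Final, children: vec![leaf("scan_a")] },
            Node { name: "filter", emission: Emission::Incremental, children: vec![leaf("scan_b")] },
        ],
    }
}

fn main() {
    let mut out = Vec::new();
    leaves_needing_yield(&example_plan(), false, &mut out);
    println!("{:?}", out);
}
```

   In this made-up plan only `scan_a` sits under a pipeline-breaking operator, so `scan_b` would be left without a wrapper.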


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

