pepijnve commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2935052670

   > We know exactly when this sort of a yielding will be needed thanks to the 
information exposed to the planner by the ExecutionPlan APIs
   
   > We know exactly when this sort of a yielding will be needed (thanks to the 
information exposed to the planner by the ExecutionPlan APIs). .... Then, the 
planner can tweak the appropriate operators in the final plan through an 
ExecutionPlan API like with_yielding_streams to support yielding when necessary.
   
   🤔 doesn't that boil down to the original proposal of adapting the operators 
as necessary? You end up exposing EmissionType as approximation of the streams 
on which it's necessary. Need to have the planner analyze this. Then you need 
to call a new method that each plan for which it's relevant needs to implement 
in a custom fashion. This feels like the long, complicated path to end up at 
the same thing as @zhuqi-lucas originally proposed where AggregateExec creates 
a specific stream when/if necessary. Why not just go direct?
   
   It feels like the largest concern is performance overhead. I think that's 
going to be negligible (to be proven by the benchmarks of course), but if it's 
not this could always be made optional via session config.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to