ozankabak commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2934960971
@pepijnve this is a good summary of why I am against changing each operator individually:

> IIRC yes -- and of few flavors. Sorting unconditionally suffers from this problem. Aggregation suffers from it when its input is unsorted. Windowing is prone too, but conditionally for some window frames. Joins will also conditionally suffer from this issue, if they collect one side fully. There are also other operators that behave this way, but in a data-dependent fashion (e.g. partial sorting). I am sure there are also others I can't think of right now.

We know exactly when this sort of yielding will be needed (thanks to the information exposed to the planner by the `ExecutionPlan` APIs). Therefore, if we are to solve this at the stream level, one thing we can contemplate is to change the stream object itself (which is used universally by all operators) to have two variants: one that encapsulates the yielding logic but introduces a small overhead, and one that has no overhead but does not yield. This could be through a generic parameter or an enum. Then, the planner can tweak the appropriate operators in the final plan through an `ExecutionPlan` API like `with_yielding_streams` to support yielding when necessary.

This route would require some design iterations, and could have unintended consequences that I am failing to see right now. Solving the issue with `YieldStreamExec` in the meantime is still the best option I see for now.
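
To make the two-variant idea concrete, here is a rough sketch of the enum flavor. It is not DataFusion code: `BatchStream`, `Counter`, `MaybeYielding`, `with_yielding`, and `drain` are all hypothetical names, and the trait is a simplified stand-in for a real async stream, built only on `std`. The assumption is that the planner would pick the variant once per operator, so the plain variant adds nothing beyond a `match`.

```rust
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical, simplified stand-in for a stream of batches.
/// (Not DataFusion's real stream trait; items are plain `u32`s here.)
trait BatchStream {
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<u32>>;
}

/// A source that is always ready, so a consumer that loops on it
/// (e.g. a sort collecting its input) would never return `Pending`.
struct Counter {
    next: u32,
    end: u32,
}

impl BatchStream for Counter {
    fn poll_next(&mut self, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.next < self.end {
            self.next += 1;
            Poll::Ready(Some(self.next - 1))
        } else {
            Poll::Ready(None)
        }
    }
}

/// Enum flavor of the "two variants" idea: one variant passes through,
/// the other injects a cooperative yield every `every` items.
enum MaybeYielding<S> {
    Plain(S),
    Yielding { inner: S, every: usize, since_yield: usize },
}

impl<S: BatchStream> MaybeYielding<S> {
    /// Hypothetical constructor the planner could call for operators
    /// that it knows need yielding.
    fn with_yielding(inner: S, every: usize) -> Self {
        assert!(every > 0, "yield interval must be positive");
        MaybeYielding::Yielding { inner, every, since_yield: 0 }
    }
}

impl<S: BatchStream> BatchStream for MaybeYielding<S> {
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<u32>> {
        match self {
            // No yielding logic: just forward the poll.
            MaybeYielding::Plain(inner) => inner.poll_next(cx),
            MaybeYielding::Yielding { inner, every, since_yield } => {
                if *since_yield >= *every {
                    // Budget used up: ask to be polled again, then yield.
                    *since_yield = 0;
                    cx.waker().wake_by_ref();
                    return Poll::Pending;
                }
                let res = inner.poll_next(cx);
                if let Poll::Ready(Some(_)) = res {
                    *since_yield += 1;
                }
                res
            }
        }
    }
}

/// Waker that does nothing; enough to drive the sketch by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a stream to completion and returns the items together with
/// the number of times the stream yielded (`Poll::Pending`).
fn drain<S: BatchStream>(mut stream: S) -> (Vec<u32>, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut items = Vec::new();
    let mut yields = 0;
    loop {
        match stream.poll_next(&mut cx) {
            Poll::Ready(Some(v)) => items.push(v),
            Poll::Ready(None) => return (items, yields),
            Poll::Pending => yields += 1,
        }
    }
}
```

Draining `MaybeYielding::Plain(..)` never observes a `Pending`, while the `with_yielding(..)` variant returns `Pending` after every `every` items. A generic-parameter flavor would move the choice to compile time instead of a runtime `match`; which of the two is cheaper in practice would need measuring.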