alamb commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2939494008
> @zhuqi-lucas great work. I've continued playing around with alternative structures in the meantime, but I keep coming back to your `YieldStream` as the most elegant solution. It's seems like the simplest solution that works well for both the plain `Stream` implementations and the implementations that wrap an async function. > > Looking at the current PR makes me less enthusiastic about `YieldStreamExec`. Still feels like an implementation detail leaking out to me. I'm thinking it might be useful to go through the various `Stream` implementations in parallel with your work and update them the way you originally did for `AggregateExec`. That will let us really compare the two alternatives we've been discussing so far. I'm happy to give that a try. I am sorry to make the conversation more complicated, but I agree with @pepijnve that I prefer the original implementation, where each `ExecutionPlan` instance is responsible for ensuring it yields appropriately (by adding a YieldStream, for example) While this does require more care implementing `ExecutionPlans` I think it is superior to the new `ExecutionPlan` because: 1. Only ExecutionPlans that don't have some other yield point are penalized with the yield. With `YieldExec` all plans get changed, even those that don't need it now will yield 2. The explain plan is now more complicated (and various optimizers that use `as_any` to check for types may need to be changed) While `YieldExec` is consistent with `CoalseceBatchesExec`, I have been hoping we can move away from `CoalesceBatchesExec` as well as it also is more of an implementation detail of filtering/repartitioning -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org