alamb commented on PR #16196:
URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2939494008

   > @zhuqi-lucas great work. I've continued playing around with alternative 
structures in the meantime, but I keep coming back to your `YieldStream` as the 
most elegant solution. It's seems like the simplest solution that works well 
for both the plain `Stream` implementations and the implementations that wrap 
an async function.
   > 
   > Looking at the current PR makes me less enthusiastic about 
`YieldStreamExec`. Still feels like an implementation detail leaking out to me. 
I'm thinking it might be useful to go through the various `Stream` 
implementations in parallel with your work and update them the way you 
originally did for `AggregateExec`. That will let us really compare the two 
alternatives we've been discussing so far. I'm happy to give that a try.
   
   I am sorry to make the conversation more complicated, but I agree with 
@pepijnve that I prefer the original implementation, where each `ExecutionPlan` 
instance is responsible for ensuring it yields appropriately (by adding a 
YieldStream, for example)
   
   While this does require more care implementing `ExecutionPlans` I think it 
is superior to the new `ExecutionPlan` because:
   1. Only ExecutionPlans that don't have some other yield point are penalized 
with the yield. With `YieldExec` all plans get changed, even those that don't 
need it now will yield
   2. The explain plan is now more complicated (and various optimizers that use 
`as_any` to check for types may need to be changed)
   
   While `YieldExec` is consistent with `CoalseceBatchesExec`, I have been 
hoping we can move away from `CoalesceBatchesExec` as well as it also is more 
of an implementation detail of filtering/repartitioning


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to