pepijnve commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2931402946
> Inserting it as a parent of leaf nodes, and only when necessary (first item in my message above), gives us a system where the least number of necessary `YieldExec`s are inserted, and at a non-arbitrary place. I think this might still be insufficient for something like this. ``` Agg: Interleave: Agg: CoalesceBatches: Filter: Yield: Data Agg: Yield: Data ``` The filter + coalesce is there to ensure the `Pending`s coming from the `Yield`a are not emitted in perfect lockstep. What I think can happen here is that `CombinedRecordBatchStream` created by `Interleave` will end up ping-ponging between the two child aggregates when they return `Pending` and you still get stuck in the for loop of the parent `Aggregate`. I did not verify this yet, but it might be worth adding a test case for this type of situation as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org