ozankabak commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2939442458
I am onboard with the approach in this PR, and it looks good to me overall. Just needs some finishing touches: - To avoid such a large diff (which is mostly plans in tests), I suggest gating this behind a configuration flag, which would default to `false` for now. We will turn it on in subsequent PRs. We just need to add a few tests with this flag turned on to get it exercised in CI. - In the next PR, before we turn the flag on, we can add an API to query whether a leaf supports built-in yielding. If it does, the rule wouldn't add `YieldStreamExec` at all to that leaf. If we add built-in yielding support to a few common leaf nodes, and turn the configuration flag on after doing so, plan changes will be minimal. - In another subsequent PR, we can fix `InterleaveExec` (@berkaysynnada and I started discussing how one can approach this problem). - I suggest making the number of batches to yield not a constant but a parameter of `YieldStreamExec`. Also, when we display it, showing its child is not informative (we see that from the print-out already). Displaying yielding frequency is probably a better idea. Thanks @zhuqi-lucas for the great work. Love it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org