alamb commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2939913792
> Why do you think this is the case? The rule only inserts YieldExecs if the plan involves pipeline-breaking, and only once at leaves. I think the penalty/overhead will always be strictly less in this approach. I missed the EmissionType::Final check in the PR -- this makes sense > This is true if we merge the PR directly as is and do not complete the work, but not if we add an API to ask whether a leaf supports built-in yielding. If it does, the rule wouldn't add YieldStreamExec at all to that leaf. If we add built-in yielding support to a few common leaf nodes, plan changes will be minimal while retaining the efficiency above. I think I missed this suggestion Thank you for the clarification -- so am I right in summarizing the final state as * `YieldExec` is inserted at the start of any ExecutionPlan unless it explicitly states (via some API) that it will yield internally? I think this is a nice design (sorry I didn't understand it at first) as allows: * Default, cancel safe behavior * Ability to optimize away stream / yield overhead for operators that already handle it internally -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org