davidhewitt commented on PR #14286: URL: https://github.com/apache/datafusion/pull/14286#issuecomment-2615483188
> This is neat -- I actually like that it is in terms of just a tokio runtime rather than something like `DedicatedExecutor`. I might try and work on IoObjectStore to follow that pattern instead

I think that makes sense to me too. When I see the current `IoObjectStore` containing a `DedicatedExecutor` and then calling `spawn_io`, the abstraction feels muddled. Having just a handle to a runtime to spawn work on seems simpler and generalises better.

> In the interests of avoiding confusion, as my objections appear to have gotten a little misinterpreted, I'd like to clarify the fact this approach comes with non-trivial overheads is **not** what concerns me with this approach. Rather that we know from experience at InfluxData that this pattern is fragile, easy to mess up, and leads to emergent behaviour that is highly non-trivial to reproduce and debug.

Just to put the brakes on rushing this into core too quickly, I have to support @tustvold in raising concern; we have had plenty of issues at Pydantic with this pattern where we have missed cases where we should have spawned IO (or CPU work) onto the other runtime.

It seems to me like most folks agree that research into schedulers for the CPU work that can sit within the single tokio runtime would be _much_ easier to integrate for downstream use cases, probably at the cost of complexity (overheads?) within datafusion itself.

We have already vendored this pattern and have it working for us, so you don't need to rush this through for us.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org