davidhewitt commented on PR #14286:
URL: https://github.com/apache/datafusion/pull/14286#issuecomment-2615483188

   > This is neat -- I actually like that it is in terms of just a tokio 
runtime rather than something like `DedicatedExecutor`. I might try and work 
on IoObjectStore to follow that pattern instead
   
   I think that makes sense to me too. When I see the current `IoObjectStore` 
containing a `DedicatedExecutor` and then calling `spawn_io`, the abstraction 
feels muddled. Having just a handle to a runtime to spawn work on seems 
simpler and generalises better.
   
   > In the interests of avoiding confusion, as my objections appear to have 
gotten a little misinterpreted, I'd like to clarify the fact this approach 
comes with non-trivial overheads is **not** what concerns me with this 
approach. Rather that we know from experience at InfluxData that this pattern 
is fragile, easy to mess up, and leads to emergent behaviour that is highly 
non-trivial to reproduce and debug.
   
   Just to put the brakes on rushing this into core too quickly, I have to 
support @tustvold in raising this concern: we have had plenty of issues at 
Pydantic with this pattern, missing cases where we should have spawned IO 
(or CPU work) onto the other runtime.
   
   It seems to me like most folks agree that schedulers for the CPU work 
that can sit within a single tokio runtime are worth researching, as they 
would be _much_ easier to integrate for downstream use cases, probably at the 
cost of complexity (overheads?) within datafusion itself.
   
   We have already vendored this pattern and have it working for us, so you 
don't need to rush this through on our account.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


