alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2495013307
> At the risk of repeating myself from [datafusion-contrib/datafusion-dft#248 (comment)](https://github.com/datafusion-contrib/datafusion-dft/pull/248#issuecomment-2489110287) I would strongly discourage overloading the ObjectStore trait as some sort of IO/CPU boundary.

I know you have said you suggest doing something different, but I don't know how to translate your suggestions into actual code. I am pretty happy now that this PR illustrates the core use case of running DataFusion plans on a separate runtime/threadpool. If you can give me some hints on how to update the example in this PR to do what you have in mind, I would be glad to try.

> Forcing every individual IO operation to be spawned to a separate runtime feels like the wrong solution to be encouraging. Instead DF should make this judgement call at a meaningful semantic boundary.

In my mind the ObjectStore is both a meaningful and obvious semantic boundary (it is the IO abstraction used by DataFusion), so I don't fully understand this point. Also, I thought having all the IO on a separate threadpool was best practice 🤔
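
For readers following the thread, the pattern being debated (handing each individual IO call to a separate pool while CPU work runs elsewhere) can be sketched roughly as below. This is only a std-only analogy using threads and channels, not the tokio-based example from the PR. `IoHandle`, `IoRequest`, `fake_fetch`, and `sum_bytes` are hypothetical names invented for illustration and are not part of DataFusion or `object_store`.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical request sent from the CPU side to the dedicated IO thread.
struct IoRequest {
    path: String,
    reply: mpsc::Sender<Vec<u8>>,
}

/// Hypothetical handle that forwards every IO call to the IO thread.
#[derive(Clone)]
struct IoHandle {
    tx: mpsc::Sender<IoRequest>,
}

impl IoHandle {
    /// Start a dedicated IO thread. `fetch` stands in for a real object store read.
    fn spawn(fetch: fn(&str) -> Vec<u8>) -> (IoHandle, thread::JoinHandle<()>) {
        let (tx, rx) = mpsc::channel::<IoRequest>();
        let join = thread::Builder::new()
            .name("io".to_string())
            .spawn(move || {
                // Runs until every IoHandle (sender) has been dropped.
                for req in rx {
                    let bytes = fetch(&req.path);
                    // The caller may have gone away; ignore a failed reply.
                    let _ = req.reply.send(bytes);
                }
            })
            .expect("failed to spawn IO thread");
        (IoHandle { tx }, join)
    }

    /// Each individual read crosses the thread boundary and blocks for the reply.
    fn get(&self, path: &str) -> Vec<u8> {
        let (reply, rx) = mpsc::channel();
        self.tx
            .send(IoRequest { path: path.to_string(), reply })
            .expect("IO thread stopped");
        rx.recv().expect("IO thread dropped the reply")
    }
}

/// Stand-in for real IO: returns the path's own bytes.
fn fake_fetch(path: &str) -> Vec<u8> {
    path.as_bytes().to_vec()
}

/// Stand-in for CPU-bound plan execution that needs data from IO.
fn sum_bytes(io: &IoHandle, paths: &[&str]) -> u64 {
    paths
        .iter()
        .map(|p| io.get(p).iter().map(|b| *b as u64).sum::<u64>())
        .sum()
}

fn main() {
    let (io, io_thread) = IoHandle::spawn(fake_fetch);

    // The "CPU" thread does the compute and delegates every read to the IO thread.
    let cpu_io = io.clone();
    let cpu_thread = thread::Builder::new()
        .name("cpu".to_string())
        .spawn(move || sum_bytes(&cpu_io, &["a", "b"]))
        .expect("failed to spawn CPU thread");

    let total = cpu_thread.join().expect("CPU thread panicked");
    println!("total = {total}");

    // Dropping the last handle closes the channel so the IO thread exits.
    drop(io);
    io_thread.join().expect("IO thread panicked");
}
```

In this sketch the boundary sits at `IoHandle::get`, which is the analogue of wrapping each `ObjectStore` call. The alternative raised in the quoted text would move that hand-off to a coarser boundary chosen by the engine; what that would look like in DataFusion is not specified in this thread.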