Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2025-01-25 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2613971547 I made another version of this example here (with no changes to the DataFusion core). Let's move any further discussion there: https://github.com/apache/datafusion/pull/14286

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2025-01-25 Thread via GitHub
alamb closed pull request #13424: Add example for using a separate threadpool for CPU bound work URL: https://github.com/apache/datafusion/pull/13424 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2025-01-10 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2582869825 This PR appears to have stalled -- it seems we are not ready to commit to this kind of wrapping in the main DataFusion crate but we also don't have any plausible alternative. T

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-12-08 Thread via GitHub
tustvold commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2526370412 So I had a brief play and actually ended up wondering if spawning IO is even the right approach to this problem... I wrote up some thoughts on https://github.com/apache/datafusion/i

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-12-08 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2526272457 @djanderson -- do you have some sort of reproducible program that produces the > I'll try to get something to show how you can do this up FWIW I understand in theory how

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-12-08 Thread via GitHub
tustvold commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2526270388 I'll try to get something to show how you can do this up

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-12-08 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2526267640 Update: * Here is an alternate attempt following @tustvold 's suggestion: https://github.com/apache/datafusion/pull/13690 * I can't figure out how to make it work practically, tho

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-12-06 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2524499072 Update: I hacked up the alternate approach (annotating all locations in DataFusion that use a different threadpool) on the plane. It didn't go great but I will make a PR tomo

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-24 Thread via GitHub
berkaysynnada commented on code in PR #13424: URL: https://github.com/apache/datafusion/pull/13424#discussion_r1855421673 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,213 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-23 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2495429782 > > I appreciate this is a more intrusive approach, but I don't really think DataFusion can continue to leave this sort of thing as an exercise for the reader, especially given the iss

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
djanderson commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2495173120 > I appreciate this is a more intrusive approach, but I don't really think DataFusion can continue to leave this sort of thing as an exercise for the reader, especially given the

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
tustvold commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2495131293 > but I don't know how to translate your suggestions into actual code The basic idea is rather than shoehorning the runtime dispatch into the ObjectStore trait, instead make t

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2495013307 > At the risk of repeating myself from [datafusion-contrib/datafusion-dft#248 (comment)](https://github.com/datafusion-contrib/datafusion-dft/pull/248#issuecomment-2489110287) I would

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
matthewmturner commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2494661569 I had the impression that this example was for illustration purposes for what it would look like to have fully separate io and cpu runtimes - although not the desired end stat

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
tustvold commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2494641789 At the risk of repeating myself from https://github.com/datafusion-contrib/datafusion-dft/pull/248#issuecomment-2489110287 I would strongly discourage overloading the ObjectStore tr

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
matthewmturner commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2494628563 @alamb perfect makes sense. I just wasn't originally clear what you meant by running it on the dedicated executor.

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2494584685 > Can you expand on point 1? My naive expectation was that all network io went through the main runtime. Yes, that is what should happen. The problem is here. As written, calling

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
matthewmturner commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2494530521 > Ok, I am pretty happy with where this PR is now. It shows the entire process running end to end, with the `DedicatedExecutor` and running always on the dedicated executor

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
alamb commented on PR #13424: URL: https://github.com/apache/datafusion/pull/13424#issuecomment-2494506254 Ok, I am pretty happy with where this PR is now. It shows the entire process running end to end, with the `DedicatedExecutor` and running always on the dedicated executor Things

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
alamb commented on code in PR #13424: URL: https://github.com/apache/datafusion/pull/13424#discussion_r1854397735 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,206 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-22 Thread via GitHub
matthewmturner commented on code in PR #13424: URL: https://github.com/apache/datafusion/pull/13424#discussion_r1854313380 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,206 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-21 Thread via GitHub
alamb commented on code in PR #13424: URL: https://github.com/apache/datafusion/pull/13424#discussion_r1851961892 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,207 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

Re: [PR] Add example for using a separate threadpool for CPU bound work [datafusion]

2024-11-21 Thread via GitHub
crepererum commented on code in PR #13424: URL: https://github.com/apache/datafusion/pull/13424#discussion_r1851840831 ## datafusion-examples/examples/thread_pools.rs: ## @@ -0,0 +1,207 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor lic