Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-12 Thread via GitHub
alamb closed issue #15323: Reduce number of tokio blocking threads in SortExec spill URL: https://github.com/apache/datafusion/issues/15323 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-06 Thread via GitHub
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781673294 I created a draft PR with a solution, would appreciate your opinion: - #15608

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-06 Thread via GitHub
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781577131 > even if you use a global tokio runtime and set the number of blocking threads to 1000, for example, there can be 1001 spill files. The problem is the same. At some point

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-06 Thread via GitHub
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781460121 > > Comet currently creates a new tokio runtime per plan but there is a proposal to move to a global tokio runtime (per executor) instead. > > [apache/datafusion-com

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-06 Thread via GitHub
andygrove commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781458966 > I have a working version locally and will create a PR soon, just one problem, I don't think we can know the number of blocking threads tokio is configured with. > > t

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-06 Thread via GitHub
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2781454412 I have a working version locally and will create a PR soon, just one problem, I don't think I can know the number of blocking threads tokio is configured with. this is i

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-03 Thread via GitHub
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2776820608 > I think I have the same problem but in `AggregateExec` when using `row_hash`, as it spills as well and uses `SortPreservingMergeStream`. > > I think the solution should

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-04-03 Thread via GitHub
rluvaton commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2776605858 I think I have the same problem but in `AggregateExec` when using `row_hash`, as it spills as well and uses `SortPreservingMergeStream`. I think the solution should ac

[I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-03-24 Thread via GitHub
andygrove opened a new issue, #15323: URL: https://github.com/apache/datafusion/issues/15323 ### Is your feature request related to a problem or challenge? In Comet, we see some queries "hang" when running with minimal memory. The issue appears to be that we have hundreds of spill fil

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-03-22 Thread via GitHub
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2744217807 Makes sense -- with 183 spill files, we probably would need to merge in stages. For example, starting with 183 spill files: 1. run 10 jobs, each merging about 10 files into
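
The staged-merge idea in the message above can be illustrated with a small, standard-library-only sketch. The message is truncated in this archive, so this is not the scheme it goes on to describe, and it is not DataFusion's implementation. It is a generic multi-pass merge with a bounded fan-in, where in-memory `Vec<i64>` runs stand in for spill files. The names `merge_runs`, `staged_merge`, and `max_fan_in` are hypothetical.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge a small group of sorted runs into one sorted run using a min-heap.
/// Each heap entry is (value, index of the run, position inside that run).
fn merge_runs(runs: &[Vec<i64>]) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }
    let mut out = Vec::with_capacity(runs.iter().map(Vec::len).sum());
    while let Some(Reverse((value, run_idx, pos))) = heap.pop() {
        out.push(value);
        // Refill the heap from the run the popped value came from.
        if let Some(&next) = runs[run_idx].get(pos + 1) {
            heap.push(Reverse((next, run_idx, pos + 1)));
        }
    }
    out
}

/// Merge groups of at most `max_fan_in` runs, pass after pass, until a
/// single run remains. Returns the merged run and the number of passes.
/// No single merge ever reads from more than `max_fan_in` inputs at once.
fn staged_merge(mut runs: Vec<Vec<i64>>, max_fan_in: usize) -> (Vec<i64>, usize) {
    assert!(max_fan_in >= 2, "fan-in must be at least 2 to make progress");
    let mut passes = 0;
    while runs.len() > 1 {
        runs = runs.chunks(max_fan_in).map(merge_runs).collect();
        passes += 1;
    }
    (runs.pop().unwrap_or_default(), passes)
}
```

Under these assumptions, 183 runs with a fan-in of 10 shrink to 19 runs, then 2, then 1, so the merge finishes in three passes without ever holding more than 10 inputs open in one merge.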

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-03-21 Thread via GitHub
andygrove commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2743795290 > Do you see too many threads when writing the spill files or when reading? This is when reading, during the merge operation. > In merge phase, each spill file wil

Re: [I] Reduce number of tokio blocking threads in SortExec spill [datafusion]

2025-03-20 Thread via GitHub
alamb commented on issue #15323: URL: https://github.com/apache/datafusion/issues/15323#issuecomment-2741967802 Do you see too many threads when writing the spill files or when reading?