Kontinuation commented on issue #1523:
URL: 
https://github.com/apache/datafusion-comet/issues/1523#issuecomment-2735209828

   The query hangs because too few blocking threads are configured for the 
tokio runtime.
   
   In the merge phase, each spill file is wrapped by a stream backed by a 
blocking thread (see 
[read_spill_as_stream](https://github.com/apache/datafusion/blob/46.0.1/datafusion/physical-plan/src/spill.rs#L44-L55)),
 so merging 183 spill files needs at least 183 blocking threads. The default 
number of blocking threads is 10, so the merge ends up waiting on streams 
whose readers never get a thread, and the query hangs indefinitely.
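   
   The starvation can be reproduced with a small standard-library model. This 
is only a sketch, not DataFusion or tokio code: `merge_can_start`, 
`read_spill`, the channel capacity of 2 and the 5 batches per file are all 
hypothetical names and values chosen for illustration. It assumes each reader 
occupies one pool thread until its file is drained, and that the merge needs a 
first batch from every stream before it can make progress.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def merge_can_start(num_spill_files, blocking_threads,
                    batches_per_file=5, channel_capacity=2, timeout=0.5):
    """Return True if a first batch arrives from every spill stream.

    Simplified model: one bounded queue per spill file, one pool task per
    file, and a fixed-size pool standing in for the blocking thread pool.
    """
    stop = threading.Event()
    channels = [queue.Queue(maxsize=channel_capacity)
                for _ in range(num_spill_files)]

    def read_spill(channel):
        # Holds a pool thread until every batch of this "file" is sent.
        for batch in range(batches_per_file):
            while not stop.is_set():
                try:
                    channel.put(batch, timeout=0.01)
                    break
                except queue.Full:
                    continue  # consumer has not drained the channel yet

    pool = ThreadPoolExecutor(max_workers=blocking_threads)
    for channel in channels:
        pool.submit(read_spill, channel)

    try:
        # The merge needs the head batch of every input stream.
        for channel in channels:
            channel.get(timeout=timeout)
        return True
    except queue.Empty:
        # Some reader never got a thread, so its stream stayed empty.
        return False
    finally:
        stop.set()  # let running and queued readers exit promptly
        pool.shutdown(wait=True)


if __name__ == "__main__":
    print(merge_can_start(183, 10))   # pool smaller than the number of files
    print(merge_can_start(183, 183))  # one thread per spill file
```

   In this model the first call reports that the merge cannot start, because 
the 10 running readers block on full channels while the remaining readers stay 
queued. The second call succeeds. The timeout exists only so the demo 
terminates; the real query has no such timeout and waits forever.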
   
   Tuning `spark.comet.blockingThreads` to a higher value could resolve this 
problem. We may consider raising the default value of 
`spark.comet.blockingThreads`, or improving sort-merge in DataFusion so that 
it does not spawn so many blocking threads.
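
   As a rough illustration of the workaround, the helper below builds the 
`spark-submit` argument that raises `spark.comet.blockingThreads`. The helper 
name, the headroom of 16 and the "spill files plus headroom" sizing rule are 
assumptions made for this sketch, not a recommendation from the Comet project; 
only the configuration key comes from the discussion above.

```python
def comet_blocking_threads_conf(expected_spill_files, headroom=16):
    """Build a spark-submit --conf pair for spark.comet.blockingThreads.

    Hypothetical sizing rule: one blocking thread per spill file that may be
    merged at once, plus some headroom for other blocking work.
    """
    threads = expected_spill_files + headroom
    return ["--conf", f"spark.comet.blockingThreads={threads}"]


# Example: a command line for a job expected to merge about 183 spill files.
# "my_job.py" is a placeholder application name.
command = ["spark-submit", *comet_blocking_threads_conf(183), "my_job.py"]
print(" ".join(command))
```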


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


