2010YOUY01 commented on PR #15610:
URL: https://github.com/apache/datafusion/pull/15610#issuecomment-2791500223

   Thank you all for the review!
   
   @qstommyshu I agree with the implementation-level feedback. I will address 
it in the refactor.
   
   @alamb Regarding parallel merging: I was thinking that if 
`max_spill_merge_degree` is configured to 10, then memory is limited so that 
each partition can only hold 10 batches at the same time, so parallel 
merging is not possible in this case.
   However, @rluvaton's PR has inspired me: it's possible that each operator 
is able to hold 100 batches under the memory limit at the same time, but we 
might still want to merge them 10 at a time for performance.
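
To make the "merge them 10 at a time" idea concrete, here is a rough sketch of a multi-pass merge with a bounded fan-in. It is not code from this PR or from DataFusion: `merge_runs` and `multi_pass_merge` are hypothetical names, in-memory `Vec<i64>` runs stand in for spill files, and `merge_degree` plays the role of the merge-degree setting discussed above.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge several already-sorted runs into one sorted run using a min-heap.
/// Each heap entry is (value, index of the run, position inside that run).
fn merge_runs(runs: &[Vec<i64>]) -> Vec<i64> {
    let mut heap = BinaryHeap::new();
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }

    let mut merged = Vec::with_capacity(runs.iter().map(Vec::len).sum());
    while let Some(Reverse((value, run_idx, pos))) = heap.pop() {
        merged.push(value);
        // Refill the heap from the run that just produced the smallest value.
        if let Some(&next) = runs[run_idx].get(pos + 1) {
            heap.push(Reverse((next, run_idx, pos + 1)));
        }
    }
    merged
}

/// Merge at most `merge_degree` runs at a time, repeating until one run is left.
/// Returns the final sorted run and the number of passes that were needed.
fn multi_pass_merge(mut runs: Vec<Vec<i64>>, merge_degree: usize) -> (Vec<i64>, usize) {
    assert!(merge_degree >= 2, "merge degree must be at least 2");
    let mut passes = 0;
    while runs.len() > 1 {
        // Groups within one pass are independent of each other, so they are
        // the unit that could be merged in parallel if memory allows it.
        runs = runs.chunks(merge_degree).map(merge_runs).collect();
        passes += 1;
    }
    (runs.pop().unwrap_or_default(), passes)
}
```

In this sketch, 100 runs with a merge degree of 10 become 10 intermediate runs after the first pass and a single run after the second. The 10 groups of the first pass do not depend on each other, which is where parallel merging could fit when the memory limit allows more than one group to be held at once.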
   
   I think the next steps are:
   1. Contribute benchmarks for external sort.
   2. Refactor this PR to avoid always re-spilling, and also do parallel 
merging when possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
