alamb commented on PR #15355:
URL: https://github.com/apache/datafusion/pull/15355#issuecomment-2749165158

   > > > 3. After we have collected 1MB of merged batches, one spill is triggered. This 1MB of space is then cleared, and the merging can continue.
   > > >    **Inefficiency:** `ExternalSorter` now creates a new spill file for each 1MB of merged batches. After all intermediates have been spilled, all spill files are merged at once, so there are too many files to merge.
   > > >    **Ideal case:** All batches in a single sorted run can be incrementally appended to a single file.
   > > 
   > > 
   > > It seems to be a regression introduced by #14823.
   > 
   > That's true, so I feel obligated to fix it.
   
   @2010YOUY01 is this something that should be tracked with a follow-on ticket?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

