ashdnazg commented on PR #15654:
URL: https://github.com/apache/datafusion/pull/15654#issuecomment-2792174978

   @2010YOUY01 I ran your benchmark locally on my Linux machine (Ryzen 7945HX),
   3 times on each version, and got:
   main: ~58s
   PR: ~57s
   which is barely distinguishable from noise.
   I also tried a version with the buffering done inside the stream using a
   tokio channel, which should reduce the `spawn_blocking` overhead, and that
   one took ~56s.
   Again, not a significant difference, but the code is significantly more
   complicated and fragile.
   I pushed that version to
   https://github.com/ashdnazg/datafusion/tree/pull-batch-2
   It would be interesting to see how it performs on the MacBook.
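
   For readers unfamiliar with the channel-based buffering idea, here is a
   minimal sketch of the general pattern. It is not the code in the linked
   branch. It assumes a single long-lived background reader feeding a bounded
   channel, so the cost of starting a blocking task is paid once rather than
   once per batch. A plain thread and `std::sync::mpsc::sync_channel` stand in
   for the tokio blocking task and tokio channel, and `Batch`,
   `buffered_reader` and `read_next` are hypothetical names used only for
   illustration.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for a record batch read back from a spill file.
type Batch = Vec<u8>;

/// Starts one background reader that pushes batches into a bounded channel.
/// `read_next` is a hypothetical blocking read that returns `None` at the end
/// of the file. `capacity` bounds how many batches are buffered ahead.
fn buffered_reader<F>(mut read_next: F, capacity: usize) -> Receiver<Batch>
where
    F: FnMut() -> Option<Batch> + Send + 'static,
{
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        while let Some(batch) = read_next() {
            // `send` blocks while the buffer is full and fails once the
            // consumer has dropped the receiver, in which case we stop.
            if tx.send(batch).is_err() {
                break;
            }
        }
    });
    rx
}

fn main() {
    // Fake "file" that yields five small batches.
    let mut remaining = 5u8;
    let rx = buffered_reader(
        move || {
            if remaining == 0 {
                None
            } else {
                remaining -= 1;
                Some(vec![remaining; 4])
            }
        },
        2,
    );

    // The consumer drains the channel; iteration ends when the reader exits.
    let batches: Vec<Batch> = rx.iter().collect();
    println!("read {} batches", batches.len());
}
```

   The bounded capacity keeps memory use predictable: the reader can run at
   most `capacity` batches ahead of the consumer.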
   
   I do worry that the benchmark might not measure the I/O bottleneck
   accurately, because the OS may be caching the spill files.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

