ashdnazg commented on PR #15654: URL: https://github.com/apache/datafusion/pull/15654#issuecomment-2792174978
@2010YOUY01 I ran your benchmark locally on my Linux machine (Ryzen 7945HX), 3 times on each version, and got main: ~58s, PR: ~57s, which is not much better than noise.

I also tried a version with the buffering done inside the stream using a tokio channel, which should reduce the `spawn_blocking` overhead, and that one got ~56s. Again, not a significant difference, and the code is considerably more complicated and fragile. I pushed that version to https://github.com/ashdnazg/datafusion/tree/pull-batch-2. It would be interesting to see its performance on the MacBook.

I do worry that the benchmark might not measure the IO bottleneck accurately, because the OS may be caching the spill files.
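For anyone skimming the thread, here is a rough sketch of the general idea behind the channel-based buffering mentioned above: a producer reads batches ahead of the consumer through a bounded channel. This is not the code in the `pull-batch-2` branch. It uses `std::sync::mpsc` and a plain thread as a std-only stand-in for a tokio channel and `spawn_blocking`, and `Batch`, `prefetch`, and `counting_reader` are hypothetical names made up for illustration.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for one decoded spill batch.
type Batch = Vec<u8>;

/// Spawns a producer thread that reads batches ahead of the consumer.
///
/// `read_next` is a hypothetical blocking reader that returns `None` at
/// end of input. `capacity` bounds how many batches may wait in the buffer.
fn prefetch<F>(mut read_next: F, capacity: usize) -> Receiver<Batch>
where
    F: FnMut() -> Option<Batch> + Send + 'static,
{
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        while let Some(batch) = read_next() {
            // `send` blocks while the buffer is full, and returns an error
            // once the consumer has dropped the receiver.
            if tx.send(batch).is_err() {
                break;
            }
        }
        // Dropping `tx` here closes the channel and ends the consumer's loop.
    });
    rx
}

/// Toy reader that yields `n` small batches, then signals end of input.
fn counting_reader(n: u8) -> impl FnMut() -> Option<Batch> + Send + 'static {
    let mut i = 0u8;
    move || {
        if i < n {
            i += 1;
            Some(vec![i; 4])
        } else {
            None
        }
    }
}
```

Calling `prefetch(counting_reader(5), 2)` and iterating the returned receiver yields the five batches in order while at most two sit in the buffer. The bounded capacity is what keeps memory use predictable, at the cost of the extra channel and task plumbing.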