zhuqi-lucas commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2948429253
@pepijnve I don’t expect either placement of YieldExec to materially affect throughput; most of the CPU/IO cost still comes from predicate evaluation or hash-join builds. The variability you’re seeing in PR 16262’s benchmarks is therefore probably just noise (e.g. OS scheduling, JIT warmup) rather than a real performance regression.

> @zhuqi-lucas @alamb I wanted to work on measuring the performance impact of this PR today, but looking at [#16262 (review)](https://github.com/apache/datafusion/pull/16262#pullrequestreview-2903139531) I'm a bit surprised to see variability similar to what I'm seeing in the benchmark results of my own experimental branch. For PR 16262 no changes were made to the production code, but you still see performance deltas. This raises the question of how the benchmarks can/should be used to evaluate code changes. How are you guys making use of these results? Should I be using something else for coarse-grained performance impact assessment?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org