alamb commented on PR #16398: URL: https://github.com/apache/datafusion/pull/16398#issuecomment-2991878307
> Zooming in on query 4 (it would be super convenient if these had 1-based indices BTW to match line numbers in the file).

Yes, I also find the ClickHouse numbering scheme confusing (we follow the https://github.com/ClickHouse/ClickBench convention).

Thank you @pepijnve -- I have reviewed your analysis and agree with your conclusion (that the reported differences are likely just noise).

To double check, I tried to reproduce the reported result manually on my local machine:

│ QQuery 4 │ 614.68 ms │ 702.11 ms │ 1.14x slower │

I used a decidedly unscientific approach:

```shell
$ cat q4.sql
SELECT COUNT(DISTINCT "UserID") FROM hits;
```

```shell
$ datafusion-cli -f q4.sql | grep Elapsed
```

Results on `merge-base` of this PR:

```
Elapsed 0.311 seconds.
Elapsed 0.287 seconds.
Elapsed 0.292 seconds.
Elapsed 0.294 seconds.
```

Results on this PR:

```
Elapsed 0.294 seconds.
Elapsed 0.301 seconds.
Elapsed 0.293 seconds.
Elapsed 0.287 seconds.
```

Thus I conclude there is no appreciable difference and we should merge this PR. I plan to do so after we get a clean CI run (I'll merge up to fix conflicts too).

The Power vs Efficiency cores point is a great (and fascinating) observation -- and one that I think deserves further study. I'll file another ticket to discuss that.
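To compare the two sets of timings quoted above at a glance, the `Elapsed` lines can be reduced to a count, mean, minimum, and maximum. The following is only a minimal sketch: `summarize_elapsed` is a hypothetical helper name, not part of `datafusion-cli`, and it assumes each timing line starts with `Elapsed` followed by the number of seconds, as in the output shown in the comment.

```shell
# Hypothetical helper (not part of datafusion-cli): reads lines of the form
# "Elapsed 0.311 seconds." from stdin and prints count, mean, min, and max.
summarize_elapsed() {
  awk '/^Elapsed/ {
         t = $2 + 0                          # second field is the time in seconds
         sum += t; n++
         if (n == 1 || t < min) min = t
         if (n == 1 || t > max) max = t
       }
       END {
         if (n == 0) { print "no timings found"; exit 1 }
         printf "runs=%d mean=%.3f min=%.3f max=%.3f\n", n, sum / n, min, max
       }'
}
```

One way to use it would be to pipe several runs into it on each branch, for example `for i in 1 2 3 4; do datafusion-cli -f q4.sql; done | summarize_elapsed`. Applied to the numbers above, the means come out at roughly 0.296 s on `merge-base` and roughly 0.294 s on this PR, with overlapping min/max ranges, which is consistent with the differences being noise.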