alamb commented on PR #13681: URL: https://github.com/apache/datafusion/pull/13681#issuecomment-2619955284
Sorry for the delay @Rachelint. I was having trouble with the benchmark queries.

Here are my benchmark results -- not bad :bowtie: almost 7x faster for our extended clickbench query:

https://github.com/apache/datafusion/blob/d05173118128f8993138a4b7d21b4a76de380756/benchmarks/queries/clickbench/extended.sql#L5

And the actual h2o benchmark (which is dominated by CSV parsing) also shows a noticeable 1.6x improvement:

https://github.com/apache/datafusion/blob/d05173118128f8993138a4b7d21b4a76de380756/benchmarks/queries/h2o/groupby.sql#L6

```
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃  main_base ┃ impl-group-accumulator-for-medi… ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2753.78ms │                        2717.63ms │     no change │
│ QQuery 1     │   828.33ms │                         736.61ms │ +1.12x faster │
│ QQuery 2     │  1621.34ms │                        1507.86ms │ +1.08x faster │
│ QQuery 3     │   734.73ms │                         739.59ms │     no change │
│ QQuery 4     │ 12552.79ms │                        1823.32ms │ +6.88x faster │
│ QQuery 5     │ 19545.52ms │                       19039.51ms │     no change │
└──────────────┴────────────┴──────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main_base)                           │ 38036.49ms │
│ Total Time (impl-group-accumulator-for-median)   │ 26564.53ms │
│ Average Time (main_base)                         │  6339.41ms │
│ Average Time (impl-group-accumulator-for-median) │  4427.42ms │
│ Queries Faster                                   │          3 │
│ Queries Slower                                   │          0 │
│ Queries with No Change                           │          3 │
└──────────────────────────────────────────────────┴────────────┘
--------------------
Benchmark h2o.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃  main_base ┃ impl-group-accumulator-for-medi… ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  2207.34ms │                        2168.72ms │     no change │
│ QQuery 2     │  5739.73ms │                        5757.47ms │     no change │
│ QQuery 3     │  4330.60ms │                        4332.62ms │     no change │
│ QQuery 4     │  3008.24ms │                        2998.72ms │     no change │
│ QQuery 5     │  4108.77ms │                        4072.95ms │     no change │
│ QQuery 6     │  6834.05ms │                        4160.14ms │ +1.64x faster │
│ QQuery 7     │  4059.20ms │                        4019.71ms │     no change │
│ QQuery 8     │  8013.99ms │                        8108.63ms │     no change │
│ QQuery 9     │ 10774.38ms │                       10642.69ms │     no change │
│ QQuery 10    │  8018.83ms │                        7916.58ms │     no change │
└──────────────┴────────────┴──────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main_base)                           │ 57095.14ms │
│ Total Time (impl-group-accumulator-for-median)   │ 54178.23ms │
│ Average Time (main_base)                         │  5709.51ms │
│ Average Time (impl-group-accumulator-for-median) │  5417.82ms │
│ Queries Faster                                   │          1 │
│ Queries Slower                                   │          0 │
│ Queries with No Change                           │          9 │
└──────────────────────────────────────────────────┴────────────┘
```
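For anyone reading the `Change` column: the figures are consistent with a plain ratio of the two timings. Below is a minimal sketch of that arithmetic, not the actual comparison script. The function name `describe_change` and the 5% "no change" band are assumptions chosen because they reproduce the rows above.

```python
def describe_change(base_ms: float, branch_ms: float, noise: float = 0.05) -> str:
    """Label a timing difference the way the Change column reads.

    `noise` is an assumed threshold (5%): relative differences inside
    this band are reported as "no change".
    """
    relative = (branch_ms - base_ms) / base_ms
    if abs(relative) <= noise:
        return "no change"
    if relative < 0:
        # Branch is quicker: report how many times faster it ran.
        return f"+{base_ms / branch_ms:.2f}x faster"
    # Branch is slower: report how many times longer it took.
    return f"{branch_ms / base_ms:.2f}x slower"


# Timings copied from the tables above.
print(describe_change(12552.79, 1823.32))  # clickbench extended, query 4
print(describe_change(6834.05, 4160.14))   # h2o, query 6
print(describe_change(2753.78, 2717.63))   # clickbench extended, query 0
```

With these inputs the sketch labels the two median queries as faster and leaves query 0 inside the noise band, which matches the reported rows.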