alamb commented on code in PR #14820: URL: https://github.com/apache/datafusion/pull/14820#discussion_r1966797142
##########
benchmarks/README.md:
##########
@@ -243,28 +244,92 @@ The `dfbench` program contains subcommands to run the various
 benchmarks. When benchmarking, it should always be built in release
 mode using `--release`.

-Full help for each benchmark can be found in the relevant sub
-command. For example to get help for tpch, run
+Full help for each benchmark can be found in the relevant
+subcommand. For example, to get help for tpch, run:

 ```shell
-cargo run --release --bin dfbench --help
+cargo run --release --bin dfbench -- tpch --help
 ...
-datafusion-benchmarks 27.0.0
-benchmark command
+dfbench-tpch 45.0.0
+Run the tpch benchmark.
+
+This benchmarks is derived from the [TPC-H][1] version
+[2.17.1]. The data and answers are generated using `tpch-gen` from
+[2].
+
+[1]: http://www.tpc.org/tpch/
+[2]: https://github.com/databricks/tpch-dbgen.git,
+[2.17.1]: https://www.tpc.org/tpc_documents_current_versions/pdf/tpc-h_v2.17.1.pdf

 USAGE:
-    dfbench <SUBCOMMAND>
+    dfbench tpch [FLAGS] [OPTIONS] --path <path>
+
+FLAGS:
+    -d, --debug
+            Activate debug mode to see more details

-SUBCOMMANDS:
-    clickbench        Run the clickbench benchmark
-    help              Prints this message or the help of the given subcommand(s)
-    parquet-filter    Test performance of parquet filter pushdown
-    sort              Test performance of parquet filter pushdown
-    tpch              Run the tpch benchmark.
-    tpch-convert      Convert tpch .slt files to .parquet or .csv files
+    -S, --disable-statistics
+            Whether to disable collection of statistics (and cost based optimizations) or not

+    -h, --help
+            Prints help information
+...
 ```

+# Writing a new benchmark
+
+## Creating or downloading data outside of the benchmark
+
+If you want to create or download the data with Rust as part of running the benchmark, see the next
+section on adding a benchmark subcommand and add code to create or download data as part of its
+`run` function.
+
+If you want to create or download the data with shell commands, in `benchmarks/bench.sh`, define a

Review Comment:
thank you -- this content is great @carols10cents

I worry it will get out of sync being in a different file. However, since it involves editing multiple files, I can't think of any place better to put it 🤔