2010YOUY01 commented on code in PR #16804:
URL: https://github.com/apache/datafusion/pull/16804#discussion_r2212651111


##########
benchmarks/bench.sh:
##########
@@ -100,15 +100,24 @@ clickbench_pushdown:    ClickBench queries against partitioned (100 files) parqu
 clickbench_extended:    ClickBench \"inspired\" queries against a single parquet (DataFusion specific)
 
 # H2O.ai Benchmarks (Group By, Join, Window)
-h2o_small:              h2oai benchmark with small dataset (1e7 rows) for groupby,  default file format is csv
-h2o_medium:             h2oai benchmark with medium dataset (1e8 rows) for groupby, default file format is csv
-h2o_big:                h2oai benchmark with large dataset (1e9 rows) for groupby,  default file format is csv
-h2o_small_join:         h2oai benchmark with small dataset (1e7 rows) for join,  default file format is csv
-h2o_medium_join:        h2oai benchmark with medium dataset (1e8 rows) for join, default file format is csv
-h2o_big_join:           h2oai benchmark with large dataset (1e9 rows) for join,  default file format is csv
-h2o_small_window:       Extended h2oai benchmark with small dataset (1e7 rows) for window,  default file format is csv
-h2o_medium_window:      Extended h2oai benchmark with medium dataset (1e8 rows) for window, default file format is csv
-h2o_big_window:         Extended h2oai benchmark with large dataset (1e9 rows) for window,  default file format is csv
+h2o_small:                      h2oai benchmark with small dataset (1e7 rows) for groupby,  default file format is csv

Review Comment:
   Later, we could clean this up by taking the dataset size and file format
   as additional arguments, for example:
   `./bench.sh run h2o_join medium parquet`
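
A rough sketch of how such arguments might be resolved, in case it helps the follow-up. Everything here is hypothetical: `resolve_h2o_args`, the `h2o_groupby`/`h2o_join`/`h2o_window` names, and the `small` default do not exist in `bench.sh` today. The row counts and the `csv` default are taken from the usage text above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bench.sh.
# Maps `<benchmark> [size] [format]` to "<benchmark> <rows> <format>".
resolve_h2o_args() {
    local bench="$1"
    local size="${2:-small}"    # assumed default size
    local format="${3:-csv}"    # csv is the current default file format
    local rows

    # Validate the benchmark family (names are assumptions).
    case "$bench" in
        h2o_groupby|h2o_join|h2o_window) ;;
        *) echo "unknown benchmark: $bench" >&2; return 1 ;;
    esac

    # Translate the size keyword into the row count used in the usage text.
    case "$size" in
        small)  rows="1e7" ;;
        medium) rows="1e8" ;;
        big)    rows="1e9" ;;
        *) echo "unknown size: $size" >&2; return 1 ;;
    esac

    # Accept only the formats mentioned in this thread.
    case "$format" in
        csv|parquet) ;;
        *) echo "unknown format: $format" >&2; return 1 ;;
    esac

    echo "${bench} ${rows} ${format}"
}
```

With this, `resolve_h2o_args h2o_join medium parquet` prints `h2o_join 1e8 parquet`, and the nine per-size entries in the usage text could collapse into three.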



##########
benchmarks/bench.sh:
##########
@@ -775,6 +840,7 @@ data_h2o() {
 
     # Set virtual environment directory
     VIRTUAL_ENV="${PWD}/venv"
+    rm -rf "$VIRTUAL_ENV"

Review Comment:
   Could you add a comment for this line?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

