alamb commented on code in PR #14225:
URL: https://github.com/apache/datafusion/pull/14225#discussion_r1924394309


##########
benchmarks/bench.sh:
##########
@@ -536,23 +536,52 @@ data_imdb() {
     done
 
     if [ "$convert_needed" = true ]; then
-        if [ ! -f "${imdb_dir}/imdb.tgz" ]; then
-            echo "Downloading IMDB dataset..."
+        # Expected size of the dataset

Review Comment:
   I tried running this locally on my Mac and `numfmt` does not seem to be 
installed. Is there any way to make it work without having to install a new 
program?
   
   ```shell
   andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ rm benchmarks/data/imdb/*.parquet
   andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ ./benchmarks/bench.sh data imdb
   ***************************
   DataFusion Benchmark Runner and Data Generator
   COMMAND: data
   BENCHMARK: imdb
   DATA_DIR: /Users/andrewlamb/Software/datafusion/benchmarks/data
   CARGO_COMMAND: cargo run --release
   PREFER_HASH_JOIN: true
   ***************************
   Looking for imdb.tgz... found
   Checking size... ./benchmarks/bench.sh: line 551: numfmt: command not found
   OK ()
   ```
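
One possible way to avoid the dependency, sketched under the assumption that `numfmt` is only used here to print the size in a human-readable form: do the formatting with `awk`, which is available by default on both macOS and Linux. The function name `human_size` is hypothetical and not part of `bench.sh`, and the output format is not guaranteed to match `numfmt` exactly.

```shell
# Hypothetical helper (not in bench.sh): format a byte count with binary
# (1024-based) unit suffixes using only awk, so numfmt is not required.
human_size() {
    awk -v bytes="$1" 'BEGIN {
        bytes = bytes + 0                  # force numeric interpretation
        split("B K M G T", units, " ")     # units[1]="B" ... units[5]="T"
        i = 1
        # Divide by 1024 until the value fits the current unit
        while (bytes >= 1024 && i < 5) {
            bytes /= 1024
            i++
        }
        printf "%.1f%s\n", bytes, units[i]
    }'
}
```

For example, `human_size 1048576` prints `1.0M`. The size check itself could then compare raw byte counts and call this helper only when printing the message.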



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

