adriangb commented on PR #12978:
URL: https://github.com/apache/datafusion/pull/12978#issuecomment-2563737914

   > Thanks @adriangb -- I think this PR is ready to go
   > 
   > One thing I noticed is that the fuzz test takes over a minute on my machine:
   > 
   > ```
   >         SLOW [> 60.000s] datafusion::fuzz fuzz_cases::pruning::test_fuzz_utf8
   >         PASS [  65.772s] datafusion::fuzz fuzz_cases::pruning::test_fuzz_utf8
   > ------------
   >      Summary [  72.749s] 47 tests run: 47 passed (1 slow), 0 skipped
   > andrewlamb@Mac:~/Software/datafusion$
   > ```
   > 
   > Is there some way to make it faster? Maybe with multiple threads, or crank down the number of things to test?
   
   Yeah, this is what I was hinting at in https://github.com/apache/datafusion/pull/12978#issuecomment-2542335627.
   
   I'm happy to throw threads at it for a start. Restricting the search space might also be necessary, but I think that requires a more careful eye to minimize how much valuable testing is discarded. The other thing I think we can do is speed up the tests themselves, in particular by minimizing unnecessary round trips to Parquet. I'm not sure, though, where the right places to hook in would be so that we still get a realistic test but no longer need to re-parse the same data over and over again.
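
   To make the "throw threads at it" idea concrete, here is a minimal sketch using only the standard library. It is not the code from this PR: `Case`, `prepare_data`, `run_case`, and `run_cases_parallel` are hypothetical names, and `prepare_data` is only a stand-in for the expensive setup step (it sorts strings, it does not touch Parquet). The point is the shape: prepare the data once, then share it by reference across scoped threads so each worker checks its own chunk of cases.

```rust
use std::thread;

/// Hypothetical stand-in for one fuzz case: a value to look up in the data.
#[derive(Debug, Clone, PartialEq)]
struct Case {
    needle: String,
}

/// Hypothetical stand-in for the expensive setup step.
/// Assumption: in the real test this would be where the data gets written and parsed.
fn prepare_data(values: &[&str]) -> Vec<String> {
    let mut data: Vec<String> = values.iter().map(|v| v.to_string()).collect();
    data.sort();
    data
}

/// Check a single case against data that has already been prepared.
fn run_case(data: &[String], case: &Case) -> bool {
    data.binary_search(&case.needle).is_ok()
}

/// Run all cases on up to `n_threads` scoped threads, sharing `data` by reference.
/// Results are returned in the same order as `cases`.
fn run_cases_parallel(data: &[String], cases: &[Case], n_threads: usize) -> Vec<bool> {
    if cases.is_empty() {
        return Vec::new();
    }
    let n = n_threads.max(1);
    // Ceiling division so every case lands in some chunk.
    let chunk_size = (cases.len() + n - 1) / n;
    thread::scope(|s| {
        let handles: Vec<_> = cases
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|case| run_case(data, case))
                        .collect::<Vec<bool>>()
                })
            })
            .collect();
        // Joining in spawn order keeps results aligned with the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

   The prepared data is built once and only borrowed by the workers, so nothing is re-parsed per case. Whether the real test can be split this cleanly depends on how much state each case needs, which this sketch does not model.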


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

