jonathanc-n commented on issue #16233: URL: https://github.com/apache/datafusion/issues/16233#issuecomment-2933126311
I agree with this. I've run into several of these, especially this one:

> Streamline seeds. There are some hardcoded seeds, but it's unclear how to call a function with arguments to get the same error-inducing dataset. It'd help to include a hint on how to generate a test batch for debugging

I'm thinking the plan would be to:

1. On failure, either set an environment variable to the failing seed and have the test rerun, or print the seed out (or both).
2. Reproduce the dataset from the saved seed and save it to an Arrow file. Something like this (example where seed = 1 failed):
   ```
   FUZZ_SEED=1 cargo test fuzz_case_example
   ```
3. Have a reader for the Arrow file.
4. Add documentation showing how to do this in `fuzz_cases`.
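
To make steps 1 and 2 a bit more concrete, here is a rough, std-only sketch of the seed handling. Everything in it is hypothetical and not existing DataFusion code: `resolve_seed`, `generate_rows`, the default seed of 42, and the SplitMix64 generator are stand-ins for whatever RNG and batch builder a given fuzz test actually uses. Writing the result to an Arrow file is left out.

```rust
use std::env;

/// Hypothetical helper: parse a seed override, falling back to `default`
/// when the value is missing or is not a valid u64.
fn resolve_seed(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(default)
}

/// One SplitMix64 step. This is only a stand-in for the RNG a real
/// fuzz test would use; any seedable generator works the same way.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical dataset generator: the same seed always yields the same rows.
fn generate_rows(seed: u64, len: usize) -> Vec<i64> {
    let mut state = seed;
    (0..len)
        .map(|_| (next_u64(&mut state) % 1000) as i64)
        .collect()
}

fn main() {
    // FUZZ_SEED is the variable name proposed above; 42 is an arbitrary default.
    let seed = resolve_seed(env::var("FUZZ_SEED").ok().as_deref(), 42);
    let rows = generate_rows(seed, 8);

    // Print the exact command needed to replay this dataset after a failure.
    eprintln!("seed = {seed}; rerun with: FUZZ_SEED={seed} cargo test fuzz_case_example");
    println!("{rows:?}");
}
```

The idea is that a failing run always reports its seed together with a copy-pasteable rerun command, so the same dataset can be regenerated without editing the test.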