adriangb commented on PR #12978:
URL: https://github.com/apache/datafusion/pull/12978#issuecomment-2542335627

   Hi @alamb, I took a stab at fuzz tests in 969e83c82.
   They're heavy and slow, so I had to restrict the search space a lot more 
than I would have liked. Maybe you or @tustvold can suggest ways to cut out the 
heavy parsing of Parquet metadata and such to speed these up? Ultimately, I do 
think it's worth re-using whatever creates Parquet stats from data so that we 
use the "real" thing, but I don't think we need to test the serialization / 
deserialization repeatedly like this does.
   Also happy to restrict the search space by being more deliberate about how 
we build the values, row groups, predicates, etc.
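
   As a rough illustration of what "more deliberate" could look like, here is a 
minimal sketch, not the code in the commit. It assumes a single `i64` column, 
min/max-only statistics, and an equality predicate. Every name in it (`Stats`, 
`stats_of`, `can_prune_eq`, `check_small_search_space`) is hypothetical and is 
not a DataFusion or parquet API. It computes stats directly from in-memory 
values, so there is no serialization round trip, and enumerates a small 
hand-picked set of edge-case values instead of sampling a large random space.

```rust
// Hypothetical sketch: none of these names come from DataFusion or the PR.

/// Min/max statistics for one row group, computed straight from the values.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Stats {
    min: i64,
    max: i64,
}

/// Compute stats in memory, with no Parquet encode/decode round trip.
/// Returns `None` for an empty row group.
fn stats_of(row_group: &[i64]) -> Option<Stats> {
    let min = *row_group.iter().min()?;
    let max = *row_group.iter().max()?;
    Some(Stats { min, max })
}

/// Pruning decision for the predicate `col = target`:
/// the row group may be skipped only if `target` lies outside [min, max].
fn can_prune_eq(stats: Stats, target: i64) -> bool {
    target < stats.min || target > stats.max
}

/// Enumerate a deliberately small search space and check the invariant
/// that a pruned row group never contains a matching row.
/// Returns the number of (row group, predicate) cases checked.
fn check_small_search_space() -> usize {
    // Hand-picked edge cases rather than random values.
    let values = [i64::MIN, -1, 0, 1, i64::MAX];
    let mut cases = 0;
    for &a in &values {
        for &b in &values {
            let row_group = [a, b];
            let stats = stats_of(&row_group).expect("non-empty row group");
            for &target in &values {
                let matches = row_group.contains(&target);
                if can_prune_eq(stats, target) {
                    assert!(!matches, "pruned a row group containing {}", target);
                }
                cases += 1;
            }
        }
    }
    cases
}
```

   Calling `check_small_search_space()` from a test walks every combination of 
the five values for two rows and one predicate literal. The real tests would 
still want to derive stats from whatever produces them for Parquet, so this 
only shows the shape of the enumeration.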


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

