timsaucer commented on issue #696: URL: https://github.com/apache/datafusion-python/issues/696#issuecomment-2117480476
OK, made some good progress on this. Will try to wrap up tomorrow. https://github.com/timsaucer/datafusion-python/blob/tsaucer/prepare_tpch_examples_for_ci/examples/tpch/_tests.py

I do like your idea of switching this to use `pytest-snapshot`, so I'll probably look at that.

I *am* adding a small set of reduced reference data, about 1.1 MB in total. This will allow users to run the examples without generating a full TPC-H data set, but if they want to reproduce the spec results they will have to run the data generator. Because the data set is so small, it will also speed up CI by not having to run the generator every time. Most importantly, if we don't have a reference data set in the repo, all users would have to run the generator just to get their unit tests to pass, which I don't think people would like.
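
For illustration only (this is not the code in the linked branch): a minimal sketch of how an example's output could be checked against a small reference file committed to the repo. The helper `rows_match_reference`, the `REFERENCE` contents, and the tolerance value are all hypothetical, and only the Python standard library is used.

```python
import csv
import io
import math


def rows_match_reference(actual_rows, reference_csv, rel_tol=1e-6):
    """Return True if actual_rows equal the rows stored in reference_csv.

    Fields that parse as numbers are compared with a relative tolerance;
    everything else is compared as text. (Hypothetical helper.)
    """
    expected_rows = list(csv.reader(io.StringIO(reference_csv)))
    if len(actual_rows) != len(expected_rows):
        return False
    for actual, expected in zip(actual_rows, expected_rows):
        if len(actual) != len(expected):
            return False
        for got, want in zip(actual, expected):
            try:
                if not math.isclose(float(got), float(want), rel_tol=rel_tol):
                    return False
            except (TypeError, ValueError):
                # Not numeric on at least one side: fall back to text comparison.
                if str(got) != want:
                    return False
    return True


# Hypothetical reference output for one query, small enough to commit.
REFERENCE = "A,F,100.5\nN,O,42.0\n"


def test_example_matches_reference():
    # Stand-in for the rows an example script would produce.
    actual = [["A", "F", 100.5000001], ["N", "O", 42]]
    assert rows_match_reference(actual, REFERENCE)
```

A snapshot plugin such as `pytest-snapshot` would take over storing and updating the reference text, so a helper like this would mainly matter where floating-point tolerance is needed.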
