timsaucer commented on issue #696:
URL: https://github.com/apache/datafusion-python/issues/696#issuecomment-2117480476

   Ok, made some good progress on this. Will try to wrap up tomorrow. 
https://github.com/timsaucer/datafusion-python/blob/tsaucer/prepare_tpch_examples_for_ci/examples/tpch/_tests.py
   
   I do like your idea of switching this to use `pytest-snapshot` so I'll 
probably look at that.
   
   I *am* adding a small set of reduced reference data, about 1.1 MB in total. 
This will allow users to run the examples without generating a full TPC-H 
data set, but if they want to reproduce the spec results they will have to run 
the data generator. The data set is also small enough that it'll speed up CI, 
since the generator won't have to run every time. Most importantly, without a 
reference data set in the repo, all users would have to run the generator just 
to get their pyunit tests to pass, which I don't think people would like.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

