I see we have an arrow-testing repo already (although it seems to be mostly empty). Would this be the correct place to create a PR to add test files?
On Sun, Jan 27, 2019 at 9:53 AM Wes McKinney <wesmck...@gmail.com> wrote:
> I'm in favor of using a submodule for testing data files to avoid
> bloating the git repository. So far this hasn't been too painful with
> the Parquet test data files.
>
> On Sun, Jan 27, 2019 at 10:36 AM Andy Grove <andygrov...@gmail.com> wrote:
> >
> > That's a fair point about not needing a submodule... I was thinking about
> > converting some of the shared parquet files to CSV to help with testing
> > DataFusion. I guess I can just put them there for now and if other
> > implementations are interested we can just move them to a shared
> > directory.
> >
> > Thanks,
> >
> > Andy.
> >
> > On Sun, Jan 27, 2019 at 9:31 AM Antoine Pitrou <anto...@python.org> wrote:
> > >
> > > Well, CSV isn't a standard like Parquet is, meaning each implementation
> > > can choose their own middle grounds and interpretations.
> > >
> > > Also, the parquet-testing submodule exists because Parquet
> > > implementations are spread across different repositories. If we want a
> > > common location for CSV files across Arrow implementations, we don't
> > > really need a submodule ;-)
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > > On 27/01/2019 at 17:28, Andy Grove wrote:
> > > > I like the fact that we have a parquet-testing submodule that is shared
> > > > across implementations. Is there any interest in having an equivalent
> > > > for CSV files?
> > > >
> > > > Andy.