I see we have an arrow-testing repo already (although it seems to be mostly
empty). Would this be the correct place to create a PR to add test files?

On Sun, Jan 27, 2019 at 9:53 AM Wes McKinney <wesmck...@gmail.com> wrote:

> I'm in favor of using a submodule for testing data files to avoid
> bloating the git repository. So far this hasn't been too painful with
> the Parquet test data files.
>
> On Sun, Jan 27, 2019 at 10:36 AM Andy Grove <andygrov...@gmail.com> wrote:
> >
> > That's a fair point about not needing a submodule... I was thinking about
> > converting some of the shared parquet files to CSV to help with testing
> > DataFusion. I guess I can just put them there for now and if other
> > implementations are interested we can just move them to a shared
> > directory.
> >
> > Thanks,
> >
> > Andy.
> >
> > On Sun, Jan 27, 2019 at 9:31 AM Antoine Pitrou <anto...@python.org> wrote:
> >
> > >
> > > Well, CSV isn't a standard like Parquet is, meaning each implementation
> > > can choose their own middle grounds and interpretations.
> > >
> > > Also, the parquet-testing submodule exists because Parquet
> > > implementations are spread across different repositories.  If we want a
> > > common location for CSV files across Arrow implementations, we don't
> > > really need a submodule ;-)
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > >
> > > On 27/01/2019 at 17:28, Andy Grove wrote:
> > > > I like the fact that we have a parquet-testing submodule that is shared
> > > > across implementations.  Is there any interest in having an equivalent
> > > > for CSV files?
> > > >
> > > > Andy.
> > > >
> > >
>
