Independent of the particulars of the discussion, the C++ project
needs to be free to create a C API for itself. If you want to try to
block the C++ contributors from doing this, we may be barreling toward
a governance crisis in the project. I'm stepping back from this
discussion for a time to allow others to catch up and to weigh in as
needed.

On Mon, Jan 20, 2020 at 1:00 PM Jacques Nadeau <jacq...@apache.org> wrote:
>
> I don't see this as an endogenous concern of the C++ project. I appreciate
> your goal with saying so but I think this has broader ramifications around
> fragmentation of the project.
>
> The core challenge that we're dealing with is we introduced foundational
> concepts in some implementations that go beyond the spec and then provided
> useful features based on them (in this case, the offset concept). Ideally,
> those concepts are first introduced at the specification level so there
> aren't inconsistent viewpoints of what Arrow is (which I believe is what is
> happening here). Having a cross-language specification for in-memory
> processing is a new concept so it isn't surprising that we're going to
> learn these things along the way.
>
> Without this, we create a slippery slope of fragmentation between the
> specifications and the implementations. I understand that the toothpaste is
> out of the tube in this particular case. We can respond in two ways: stop
> the slip or continue to slide down the slope. I'm inclined to stop the slip.
>
> As I said on GitHub, I'm struggling with how much of this should be
> solved in the project. I'm going to pause a bit on responding to reflect
> further about this as well to reduce the likelihood that this devolves into
> a flame war (which is always a risk with complex issues such as these).
>
>
>
> On Mon, Jan 20, 2020 at 9:59 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Jacques,
> >
> > Taking a step back from the discussion, the original problem statement
> > was to enable third party projects to produce the data structure used
> > by C++ Array classes in C without depending on the C++ code.
> >
> > That's the ArrayData class here:
> >
> > https://github.com/apache/arrow/blob/master/cpp/src/arrow/array.h#L232
> >
> > It is important for us to simplify the programming interface with the C++
> > library, so I think that we should address this as an endogenous
> > concern of the C++ project, namely providing a "C API for the C++
> > project". The C API for the C++ library needs to mirror what's in the
> > C++ project (i.e. the ArrayData data structure). We should not
> > advertise this as being a part of the project specification.
> >
> > - Wes
> >
> > On Mon, Jan 20, 2020 at 11:51 AM Jacques Nadeau <jacq...@apache.org> wrote:
> > >
> > > As I noted on the pull request, I think fundamentally this work is at
> > > odds with the Arrow specification and is being used to introduce a
> > > shadow specification.
> > >
> > > I don't think our intentions about how people should use something really
> > > influence how people will actually use or perceive it. They'll just find
> > > supported Arrow code and expose things based on it and call it "Arrow
> > > compatible". In other words, I don't think people in the outside world
> > > will be able to perceive the distinction between "Arrow C++ compatible"
> > > and "Arrow compatible".
> > >
> > > On Mon, Jan 20, 2020 at 9:28 AM Wes McKinney <wesmck...@gmail.com> wrote:
> > >
> > > > hi folks,
> > > >
> > > > I just made a comment in https://github.com/apache/arrow/pull/6026
> > > > that I wanted to surface here on the mailing list.
> > > >
> > > > It seems that to reach consensus for a C interface that is intended to
> > > > be broadly used by multiple programming languages, we may make some
> > > > compromises that harm or outright undermine some of the use cases that
> > > > motivated the creation of the C interface in the first place. That
> > > > does not seem good. I wonder if it would be more productive to reduce
> > > > the scope of the project to merely providing a C-header-based data
> > > > interface to the C++ project only. That was the original problem
> > > > statement, and it seems that attempting to make it useful beyond C++
> > > > has made it difficult to reach consensus.
> > > >
> > > > Thanks
> > > > Wes
> > > >
> > > > On Sat, Dec 21, 2019 at 4:38 PM Jacques Nadeau <jacq...@apache.org> wrote:
> > > > >
> > > > > Thanks for addressing my comments. I'm actively reviewing the
> > > > > proposal. It is taking me more time than I would like given the time
> > > > > of the year but I want to make sure that you know that I'm looking
> > > > > at it and hope to provide additional feedback beyond that which I've
> > > > > provided thus far on the PR. Will update soon.
> > > > >
> > > > > Thanks for your patience.
> > > > >
> > > > > On Tue, Dec 17, 2019 at 11:16 AM Antoine Pitrou <solip...@pitrou.net> wrote:
> > > > >
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > Following Jacques's feedback, I drafted a new version of the C data
> > > > > > interface spec.
> > > > > >
> > > > > > The spec PR is here:
> > > > > > https://github.com/apache/arrow/pull/6040
> > > > > > Direct link to the RST file:
> > > > > > https://github.com/apache/arrow/blob/5d8669d371401f9db12326b079e13c0058ba972b/docs/source/format/CDataInterface.rst
> > > > > >
> > > > > > There is also a C++ implementation, together with a Python <-> R
> > > > > > bridge demonstrating the functionality:
> > > > > > https://github.com/apache/arrow/pull/6026
> > > > > >
> > > > > > The main change from the previous spec is that there are now two C
> > > > > > structures: one for the type or schema information, one for the
> > > > > > array or record batch data. This allows exchanging both kinds of
> > > > > > information independently (and so, potentially, to exchange schema
> > > > > > once and then multiple arrays or record batches).
> > > > > >
> > > > > > Comments and questions welcome.
> > > > > >
> > > > > > Regards
> > > > > >
> > > > > > Antoine.
> > > > > >
> > > > > >
> > > > > >
> > > >
> >
