I agree that requiring addition to a "complete" implementation would be
unfortunate, if only because a cursory glance at [1] shows that there
aren't any that implement the entire specification anyway. I don't think
this should preclude adding new array types, although it might give us
cause to pause before adding many more...
I personally think requiring at least two native implementations,
accompanying integration tests, and a formal vote should be sufficient.
This will serve to bring visibility to the proposal, ensure it is
tractable, and confirm that there are no objections to it. Ultimately
I think it is fine for implementations to differ in the feature set they
implement, provided it is clearly communicated, as I think [1] does very
well. Users are then able to make an informed judgement call as to
whether they wish to use a given feature based on its adoption.
[1]: https://arrow.apache.org/docs/status.html
On 11/01/2023 21:11, Brian Hulette wrote:
I think this [1] is the thread where the policy was proposed, but it
doesn't look like we ever settled on "Java and C++" vs. "any two
implementations", or had a vote.
I worry that requiring maintainers to add new format features to two
"complete" implementations will just lead to fragmentation. People might
opt to maintain a fork rather than unblock themselves by implementing a
backlog of features they don't need.
[1] https://lists.apache.org/thread/9t0pglrvxjhrt4r4xcsc1zmgmbtr8pxj
On Fri, Jan 6, 2023 at 12:33 PM Weston Pace <weston.p...@gmail.com> wrote:
I think it would be reasonable to state that a reference
implementation must be a complete implementation (i.e. supports all
existing types) that is not derived from another implementation (e.g.
you can't pick pyarrow and arrow-c++). If an implementation does not
plan on ever supporting a new array type then maintainers of that
implementation should be empowered to vote against it. Given that, it
seems like a reasonable burden to ask maintainers to catch up first
before expanding in new directions.
On Fri, Jan 6, 2023 at 10:20 AM Micah Kornfield <emkornfi...@gmail.com>
wrote:
Note this wording talks about "two reference implementations" not
"*the* two reference implementations". So there can be more than two
reference implementations.
Maybe "reference implementation" is the wrong wording here. My main
concern is that we try to maintain two "feature complete" implementations
at all times. I worry that a "pick 2 from N" choice of reference
implementations potentially leads to fragmentation more quickly. But
maybe this is premature?
Cheers,
Micah
On Fri, Jan 6, 2023 at 10:02 AM Antoine Pitrou <anto...@python.org>
wrote:
On 06/01/2023 at 18:58, Micah Kornfield wrote:
I'm having trouble finding it, but I think we've previously agreed that
new features needed implementations in 2 reference implementations before
approval (I had thought the community agreed on Java and C++ as the two
implementations but I can't find the vote thread on it).
Note this wording talks about "two reference implementations" not
"*the* two reference implementations". So there can be more than two
reference implementations.
Regards
Antoine.