Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-27 Thread Leif Walsh
I think Wes' idea that major versions indicate stability of the spec and minor versions indicate stability of each implementation's API makes sense. With that in mind, maybe before 1.0 of the spec we should just establish, within each of the reference language implementations, a mechanism for speci

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-27 Thread Julian Hyde
Semantic versioning is a great tool, and we should use it as far as it goes, but not push it. I suggest that the Arrow specification should have a paragraph that states the level of maturity of each part of the API; and each implementation should have a paragraph that states which parts of the spe

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
The combinatorics of code-level API stability are worrisome (with already 5 different language APIs in the project) while the maturity and development pace of different implementations may remain variable for some time. There are two possible things we can communicate with some form of major versi

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
I agree with all that. But semantic versioning only pertains to public APIs. So, for it to work, you need to declare what are your public APIs. If you don’t, people will make assumptions about what are your public APIs, and they may get it wrong. The ability to add experimental APIs (not subjec

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
Yes, definitely, sorry to not make that more clear. As part of this process we should draw up a documentation page about how to interpret the version numbers as a third party user, and how we will handle documenting experimental features. For example, we might add an experimental new logical type a

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
It sounds as if you agree with me: It is very important that we clearly state which bits of Arrow are fixed and which bits are not.
> On Jul 26, 2017, at 11:56 AM, Wes McKinney wrote:
>
> Given the nature of the Arrow project, where any number of different
> implementations will be in flux at a

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
I see the semantic versioning like this:
Major version: Format and Metadata stability
Minor version: API stability within fix versions
Fix version: Bug fixes
So an API might be deprecated from 1.0.0 to 1.1.0, but we could not make a breaking change to the memory format without increasing the majo
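As a rough illustration of the Major / Minor / Fix reading above, here is a minimal Python sketch. It is not part of any Arrow implementation: the names `Version`, `parse_version`, `same_format`, and `api_may_differ` are hypothetical, and treating 0.x releases as carrying no format guarantee is an assumption borrowed from general semantic versioning practice, not something stated in this thread.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """Hypothetical holder for a 'major.minor.fix' version string."""
    major: int
    minor: int
    fix: int


def parse_version(text: str) -> Version:
    """Parse a plain 'X.Y.Z' string; suffixes such as '-rc1' are not handled."""
    major, minor, fix = (int(part) for part in text.split("."))
    return Version(major, minor, fix)


def same_format(a: str, b: str) -> bool:
    """True when two releases promise the same memory format and metadata.

    Under the scheme above, that promise is tied to the major version.
    Assumption: 0.x releases make no such promise unless they are identical.
    """
    va, vb = parse_version(a), parse_version(b)
    if va.major == 0 or vb.major == 0:
        return va == vb
    return va.major == vb.major


def api_may_differ(a: str, b: str) -> bool:
    """True when code-level APIs may differ (any major or minor change)."""
    va, vb = parse_version(a), parse_version(b)
    return (va.major, va.minor) != (vb.major, vb.minor)
```

Read this way, 1.0.0 and 1.1.0 share a format but may differ in API (for example through a deprecation), while 1.1.0 and 1.1.3 differ only by bug fixes.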

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
Given the nature of the Arrow project, where any number of different implementations will be in flux at any given time, claiming any sort of API stability at the code level across the whole project seems impossible any time soon. The important commitment of a 1.0 release is that the metadata and m

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
1.0 is a Big Deal because, under semantic versioning, there is a commitment to not change public APIs. If it weren’t for that, 1.0 would have vague marketing connotations of robustness, adoption etc. but otherwise be no different from another release. So, if API and data format lifecycle and co

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
I created https://issues.apache.org/jira/browse/ARROW-1277 about integration testing remaining data types. We are so close to having everything tested and stable, we should push to complete these as soon as possible (save for Map, which has only just been added to the metadata).
On Mon, Jul 24, 201

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-24 Thread Wes McKinney
I agree those things would be nice to have. Hardening the memory format details probably would not take longer than a month or so if we were to focus in on it. Formalizing REST / RPC or IPC seems like it will be more work, or will require a design period and then initial implementation. I think ha

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-24 Thread Jacques Nadeau
Top things on my list:
- Formalize Arrow RPC and/or REST
- Some reference transformation algorithms
- Prototype IPC
On Mon, Jul 24, 2017 at 9:47 AM, Wes McKinney wrote:
> hi folks,
>
> In recent discussions, since the Arrow memory format and metadata has
> become reasonably stabilized, and we'r