[jira] [Created] (ARROW-1284) Windows can't install pyarrow 0.4.1 and 0.5.0

2017-07-26 Thread nooper wang (JIRA)
nooper wang created ARROW-1284:
  Summary: Windows can't install pyarrow 0.4.1 and 0.5.0
  Key: ARROW-1284
  URL: https://issues.apache.org/jira/browse/ARROW-1284
  Project: Apache Arrow
  Issue Type: Bu

[jira] [Created] (ARROW-1283) [Java] VectorSchemaRoot should be able to be closed() more than once

2017-07-26 Thread Bryan Cutler (JIRA)
Bryan Cutler created ARROW-1283:
  Summary: [Java] VectorSchemaRoot should be able to be closed() more than once
  Key: ARROW-1283
  URL: https://issues.apache.org/jira/browse/ARROW-1283
  Project: Apache Arro

[jira] [Created] (ARROW-1282) Large memory reallocation by Arrow causes hang in jemalloc

2017-07-26 Thread Jeff Knupp (JIRA)
Jeff Knupp created ARROW-1282:
  Summary: Large memory reallocation by Arrow causes hang in jemalloc
  Key: ARROW-1282
  URL: https://issues.apache.org/jira/browse/ARROW-1282
  Project: Apache Arrow
  Iss

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
The combinatorics of code-level API stability are worrisome (with already 5 different language APIs in the project) while the maturity and development pace of different implementations may remain variable for some time. There are two possible things we can communicate with some form of major versi

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
I agree with all that. But semantic versioning only pertains to public APIs. So, for it to work, you need to declare what are your public APIs. If you don’t, people will make assumptions about what are your public APIs, and they may get it wrong. The ability to add experimental APIs (not subjec

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
Yes, definitely, sorry to not make that more clear. As part of this process we should draw up a documentation page about how to interpret the version numbers as a third party user, and how we will handle documenting experimental features. For example, we might add an experimental new logical type a

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
It sounds as if you agree with me: It is very important that we clearly state which bits of Arrow are fixed and which bits are not.
> On Jul 26, 2017, at 11:56 AM, Wes McKinney wrote:
>
> Given the nature of the Arrow project, where any number of different
> implementations will be in flux at a

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
I see the semantic versioning like this:
Major version: Format and Metadata stability
Minor version: API stability within fix versions
Fix version: Bug fixes
So an API might be deprecated from 1.0.0 to 1.1.0, but we could not make a breaking change to the memory format without increasing the majo
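A minimal sketch of how a third party might read version numbers under the scheme proposed in this message. The names `Version`, `parse`, `same_format`, and `api_may_differ` are hypothetical and not part of any Arrow API; the rules encoded here are only the ones stated above, and pre-1.0 releases are not covered.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int  # format and metadata stability
    minor: int  # API stability within fix versions
    fix: int    # bug fixes only


def parse(text: str) -> Version:
    """Parse a 'major.minor.fix' string such as '1.1.0'."""
    major, minor, fix = (int(part) for part in text.split("."))
    return Version(major, minor, fix)


def same_format(a: str, b: str) -> bool:
    """Assumed rule: releases with the same major version share the
    memory format and metadata."""
    return parse(a).major == parse(b).major


def api_may_differ(a: str, b: str) -> bool:
    """Assumed rule: code-level APIs may be deprecated or changed whenever
    the major or minor version differs, but not across fix versions."""
    va, vb = parse(a), parse(b)
    return (va.major, va.minor) != (vb.major, vb.minor)
```

Under these assumptions, 1.0.0 and 1.1.0 share a memory format but may differ in API, while 1.0.0 and 1.0.1 differ only in bug fixes.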

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
Given the nature of the Arrow project, where any number of different implementations will be in flux at any given time, claiming any sort of API stability at the code level across the whole project seems impossible any time soon. The important commitment of a 1.0 release is that the metadata and m

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Julian Hyde
1.0 is a Big Deal because, under semantic versioning, there is a commitment to not change public APIs. If it weren’t for that, 1.0 would have vague marketing connotations of robustness, adoption etc. but otherwise be no different from another release. So, if API and data format lifecycle and co

[jira] [Created] (ARROW-1281) [C++/Python] Add Docker setup for testing HDFS tests and other tests we may not run in Travis CI

2017-07-26 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1281:
  Summary: [C++/Python] Add Docker setup for testing HDFS tests and other tests we may not run in Travis CI
  Key: ARROW-1281
  URL: https://issues.apache.org/jira/browse/ARROW-1281

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-26 Thread Wes McKinney
I created https://issues.apache.org/jira/browse/ARROW-1277 about integration testing remaining data types. We are so close to having everything tested and stable, we should push to complete these as soon as possible (save for Map, which has only just been added to the metadata).
On Mon, Jul 24, 201

[jira] [Created] (ARROW-1280) [C++] Implement Fixed Size List type

2017-07-26 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1280:
  Summary: [C++] Implement Fixed Size List type
  Key: ARROW-1280
  URL: https://issues.apache.org/jira/browse/ARROW-1280
  Project: Apache Arrow
  Issue Type: New Featu

[jira] [Created] (ARROW-1279) Integration tests for Map type

2017-07-26 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1279:
  Summary: Integration tests for Map type
  Key: ARROW-1279
  URL: https://issues.apache.org/jira/browse/ARROW-1279
  Project: Apache Arrow
  Issue Type: New Feature

[jira] [Created] (ARROW-1278) Integration tests for Fixed Size List type

2017-07-26 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1278:
  Summary: Integration tests for Fixed Size List type
  Key: ARROW-1278
  URL: https://issues.apache.org/jira/browse/ARROW-1278
  Project: Apache Arrow
  Issue Type: New

[jira] [Created] (ARROW-1277) Completing integration tests for major implemented data types

2017-07-26 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1277:
  Summary: Completing integration tests for major implemented data types
  Key: ARROW-1277
  URL: https://issues.apache.org/jira/browse/ARROW-1277
  Project: Apache Arrow

[jira] [Created] (ARROW-1276) Cannot serializer empty DataFrame to parquet

2017-07-26 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-1276:
  Summary: Cannot serializer empty DataFrame to parquet
  Key: ARROW-1276
  URL: https://issues.apache.org/jira/browse/ARROW-1276
  Project: Apache Arrow
  Issue Type: