[jira] [Created] (ARROW-1253) [C++] Use pre-built toolchain libraries where prudent to speed up CI builds

2017-07-23 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1253: --- Summary: [C++] Use pre-built toolchain libraries where prudent to speed up CI builds Key: ARROW-1253 URL: https://issues.apache.org/jira/browse/ARROW-1253 Project: Apac

[jira] [Created] (ARROW-1252) [Website] Update for 0.5.0 release, add blog post summarizing changes from 0.4.x

2017-07-23 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1252: --- Summary: [Website] Update for 0.5.0 release, add blog post summarizing changes from 0.4.x Key: ARROW-1252 URL: https://issues.apache.org/jira/browse/ARROW-1252 Project:

Re: Parquet+Arrow Java

2017-07-23 Thread Wes McKinney
hi Masayuki, I don't have direct experience using Arrow with Parquet in Java, but a common approach is to set a batch size (number of logical rows) and compute a sequence of Arrow record batches converted from the Parquet file. We are only supporting monolithic file and row group reads in C++ (ht
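
The batching approach described in this reply can be sketched without any Arrow or Parquet dependency. This is only an illustration under stated assumptions: `rebatch`, `Row`, and the in-memory `row_groups` lists are hypothetical stand-ins (not part of the Arrow or Parquet APIs), with each inner list playing the role of one Parquet row group and each yielded list playing the role of one Arrow record batch.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for a decoded row; a real reader would produce columnar data.
Row = Dict[str, object]


def rebatch(row_groups: Iterable[List[Row]], batch_size: int) -> Iterator[List[Row]]:
    """Yield batches of at most `batch_size` logical rows.

    Batches are cut purely by row count, so a batch may span a row group
    boundary and a row group may be split across several batches.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    pending: List[Row] = []
    for group in row_groups:
        for row in group:
            pending.append(row)
            if len(pending) == batch_size:
                yield pending
                pending = []
    if pending:
        # Final, possibly short, batch.
        yield pending


# Two hypothetical row groups holding 5 and 2 rows.
row_groups = [
    [{"id": i} for i in range(0, 5)],
    [{"id": i} for i in range(5, 7)],
]
batches = list(rebatch(row_groups, batch_size=3))
batch_lengths = [len(b) for b in batches]
```

Cutting by row count decouples the batch size from however the file happened to be written; producing one batch per row group is the simpler alternative when the row groups are already a convenient size.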

[jira] [Created] (ARROW-1251) [Python/C++] Revise build documentation to account for latest build toolchain

2017-07-23 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1251: --- Summary: [Python/C++] Revise build documentation to account for latest build toolchain Key: ARROW-1251 URL: https://issues.apache.org/jira/browse/ARROW-1251 Project: Ap

Re: [VOTE] Release Apache Arrow 0.5.0 - RC2

2017-07-23 Thread Wes McKinney
hi Colin, Thanks for noting the removal of pyarrow.TimestampType from the public API. I created https://issues.apache.org/jira/browse/ARROW-1250 about coming up with a more comprehensive API for type checking or schema validation - Wes On Sun, Jul 23, 2017 at 1:38 PM, Colin Nichols wrote: > +1

[jira] [Created] (ARROW-1250) [Python] Define API for user type checking of array types

2017-07-23 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1250: --- Summary: [Python] Define API for user type checking of array types Key: ARROW-1250 URL: https://issues.apache.org/jira/browse/ARROW-1250 Project: Apache Arrow

[RESULT] [VOTE] Accept contribution of Plasma Object Store

2017-07-23 Thread Wes McKinney
With 6 +1 votes (3 binding), and no +0 or -1 votes, the vote to determine whether the Arrow PMC accepts the Plasma Object Store passes. Thank you to all who voted in support of this. The Arrow PMC will work with the code authors to complete the ASF IP Clearance process so that this code can be inc

[RESULT] [VOTE] Release Apache Arrow 0.5.0 - RC2

2017-07-23 Thread Wes McKinney
Thanks everyone for voting. With 3 binding +1 votes, 1 non-binding +1 and a +0, the vote passes. I will work later today on drafting a release announcement and corresponding website updates. - Wes On Sun, Jul 23, 2017 at 1:38 PM, Colin Nichols wrote: > +1 > > - Ran manylinux1 Python build+tests

Re: [VOTE] Release Apache Arrow 0.5.0 - RC2

2017-07-23 Thread Colin Nichols
+1 - Ran manylinux1 Python build+tests - installed resulting wheel on ubuntu 16.04 Had to update imports of TimestampType in my code since it was removed from the top level. I am using it for type checking of columns. -- sent from my phone -- > On Jul 23, 2017, at 10:33, Julien Le Dem wrote:

Re: [VOTE] Accept contribution of Plasma Object Store

2017-07-23 Thread Arun K. Subramaniyan
+1 On Sun, Jul 23, 2017 at 1:16 AM Uwe L. Korn wrote: > +1 > > On Fri, Jul 21, 2017, at 01:37 AM, Julian Hyde wrote: > > +1 > > > > > On Jul 20, 2017, at 3:07 PM, Bryan Cutler wrote: > > > > > > +1 sounds great! > > > > > > On Thu, Jul 20, 2017 at 11:14 AM, Wes McKinney > wrote: > > > > > >> D

Re: [VOTE] Release Apache Arrow 0.5.0 - RC2

2017-07-23 Thread Julien Le Dem
+1 (binding) on macOS: * Verified signature * Ran Java build, unit tests, packages * Built and ran tests for C++ One note: - missing from the build notes: new jemalloc dependency (I had to brew install jemalloc) On Sun, Jul 23, 2017 at 6:20 AM, Uwe L. Korn wrote: > +1 (binding) > > * Verified sig

Re: [VOTE] Release Apache Arrow 0.5.0 - RC2

2017-07-23 Thread Uwe L. Korn
+1 (binding) * Verified signature and checksum * Ran Java unit tests, packaged to JARs * Built and ran C++ & Python unit tests on Debian 7 / Debian 8 with gcc5.4 * Built and ran C++ & Python unit tests in manylinux1 container on CentOS 5 * Ran Python unit tests including --with-parquet Uwe On Fr

Re: Parquet+Arrow Java

2017-07-23 Thread Masayuki Takahashi
Hi, I am trying to convert Parquet files to Arrow. https://gist.github.com/masayuki038/4be6c8538dfd4563a8d5ff743cf375ae And I have a question. When converting Parquet to Arrow, is it the right approach to create one Arrow VectorSchemaRoot for each Parquet RowGroup? Thanks. 2017-07-21 5:19 GMT+09:00 Wes M

Re: Parquet+Arrow Java

2017-07-23 Thread Michael Shtelma
Yes, it would be great to have a component/library that can be embedded in any other product and can perform operations like aggregation/join/filter/etc. on Arrow datasets. Do you think it is really hard to extract this part out of dremio-oss? Sincerely, Michael Shtelma On Sat, Jul 2

Re: [VOTE] Accept contribution of Plasma Object Store

2017-07-23 Thread Uwe L. Korn
+1 On Fri, Jul 21, 2017, at 01:37 AM, Julian Hyde wrote: > +1 > > > On Jul 20, 2017, at 3:07 PM, Bryan Cutler wrote: > > > > +1 sounds great! > > > > On Thu, Jul 20, 2017 at 11:14 AM, Wes McKinney wrote: > > > >> Dear all, > >> > >> The Plasma Object Store provides a server process, referen