[jira] [Created] (ARROW-7830) [C++] Parquet library version doesn't change with releases

2020-02-10 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-7830: Summary: [C++] Parquet library version doesn't change with releases Key: ARROW-7830 URL: https://issues.apache.org/jira/browse/ARROW-7830 Project: Apache Arrow

[jira] [Created] (ARROW-7829) [R] Test R bindings on clang

2020-02-10 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-7829: Summary: [R] Test R bindings on clang Key: ARROW-7829 URL: https://issues.apache.org/jira/browse/ARROW-7829 Project: Apache Arrow Issue Type: Improvement

[jira] [Created] (ARROW-7828) [Release] Remove SSH keys for internal use

2020-02-10 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-7828: Summary: [Release] Remove SSH keys for internal use Key: ARROW-7828 URL: https://issues.apache.org/jira/browse/ARROW-7828 Project: Apache Arrow Issue Type: Imp

[jira] [Created] (ARROW-7827) conda-forge pyarrow package does not have s3 enabled

2020-02-10 Thread Catherine (Jira)
Catherine created ARROW-7827: Summary: conda-forge pyarrow package does not have s3 enabled Key: ARROW-7827 URL: https://issues.apache.org/jira/browse/ARROW-7827 Project: Apache Arrow Issue Type:

Re: [Format] Dictionary edge cases (encoding nulls and nested dictionaries)

2020-02-10 Thread Micah Kornfield
Hi Wes and Brian, Thanks for the feedback. My intent in raising these issues is that they make the spec harder to work with/implement (i.e. we have existing bugs, etc). I'm wondering if we should take the opportunity to simplify before things are set in stone. If we think things are already set,

Re: AttributeError importing pyarrow 0.16.0

2020-02-10 Thread Tom Augspurger
Thanks for linking to that. The Python there does seem problematic. Upgrading to TravisCI's "bionic" image (with Python 3.7.5 instead of 3.7.1) seems to have fixed it. Tom On Mon, Feb 10, 2020 at 1:34 PM Wes McKinney wrote: > hi Tom, > > Looks like it could be https://bugs.python.org/issue3297

Re: [Format] Dictionary edge cases (encoding nulls and nested dictionaries)

2020-02-10 Thread Wes McKinney
On Sun, Feb 9, 2020 at 12:53 AM Micah Kornfield wrote: > > I'd like to understand if anyone is making use of the following features > and if we should revisit them before 1.0. > > 1. Dictionaries can encode null values. > - This becomes error-prone for things like parquet. We seem to be > calcula

Re: AttributeError importing pyarrow 0.16.0

2020-02-10 Thread Wes McKinney
hi Tom, Looks like it could be https://bugs.python.org/issue32973, but I'm not sure. I wasn't able to reproduce locally. The Python version 3.7.1 running in CI is also potentially suspicious. This class of error seems to have a lot of bug reports based on Google searches. Message isn't picklable

[jira] [Created] (ARROW-7826) [Python] Implement __reduce__ for pyarrow.lib.Message

2020-02-10 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-7826: Summary: [Python] Implement __reduce__ for pyarrow.lib.Message Key: ARROW-7826 URL: https://issues.apache.org/jira/browse/ARROW-7826 Project: Apache Arrow Issu
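
For readers unfamiliar with the hook named in this issue title, the following is a minimal sketch of how `__reduce__` makes an object picklable. It uses a plain-Python stand-in class, not pyarrow's actual Cython-backed `pyarrow.lib.Message`; the class name `DemoMessage`, its `metadata` and `body` fields, and the `roundtrip` helper are all assumptions made for illustration only.

```python
import pickle


class DemoMessage:
    """Hypothetical stand-in for a message type (not pyarrow.lib.Message)."""

    def __init__(self, metadata: bytes, body: bytes):
        self.metadata = metadata
        self.body = body

    def __reduce__(self):
        # Tell pickle how to rebuild the object: a callable plus the
        # positional arguments to call it with when unpickling.
        return (DemoMessage, (self.metadata, self.body))

    def __eq__(self, other):
        return (
            isinstance(other, DemoMessage)
            and self.metadata == other.metadata
            and self.body == other.body
        )


def roundtrip(message: DemoMessage) -> DemoMessage:
    """Serialize and deserialize a message through pickle."""
    return pickle.loads(pickle.dumps(message))


original = DemoMessage(b"schema", b"payload")
restored = roundtrip(original)
```

Returning a `(callable, args)` pair is the simplest form of `__reduce__`; whether the real class would reconstruct itself from raw buffers in this way is not stated in the thread.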

AttributeError importing pyarrow 0.16.0

2020-02-10 Thread Tom Augspurger
Hi all, I'm seeing a strange issue when importing pyarrow on the intake CI. I get an exception saying AttributeError: type object 'pyarrow.lib.Message' has no attribute '__reduce_cython__' The full traceback is: __ test_arrow_import __

[jira] [Created] (ARROW-7825) Have arrow::read_parquet respect options(stringsAsFactors = FALSE)

2020-02-10 Thread Keith Hughitt (Jira)
Keith Hughitt created ARROW-7825: Summary: Have arrow::read_parquet respect options(stringsAsFactors = FALSE) Key: ARROW-7825 URL: https://issues.apache.org/jira/browse/ARROW-7825 Project: Apache Arro

[jira] [Created] (ARROW-7824) [C++][Dataset] Provide Dataset writing to IPC format

2020-02-10 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-7824: Summary: [C++][Dataset] Provide Dataset writing to IPC format Key: ARROW-7824 URL: https://issues.apache.org/jira/browse/ARROW-7824 Project: Apache Arrow Issue

[jira] [Created] (ARROW-7823) Have feather::read_feather respect options(stringsAsFactors = FALSE)

2020-02-10 Thread Keith Hughitt (Jira)
Keith Hughitt created ARROW-7823: Summary: Have feather::read_feather respect options(stringsAsFactors = FALSE) Key: ARROW-7823 URL: https://issues.apache.org/jira/browse/ARROW-7823 Project: Apache Ar

[jira] [Created] (ARROW-7822) [C++] Allocation free error Status constants

2020-02-10 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-7822: Summary: [C++] Allocation free error Status constants Key: ARROW-7822 URL: https://issues.apache.org/jira/browse/ARROW-7822 Project: Apache Arrow Issue Type: I

[jira] [Created] (ARROW-7821) [Gandiva] Add support for literal variables

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7821: Summary: [Gandiva] Add support for literal variables Key: ARROW-7821 URL: https://issues.apache.org/jira/browse/ARROW-7821 Project: Apache Arrow

[jira] [Created] (ARROW-7820) [C++][Gandiva] Add CMake support for compiling LLVM's IR into a library

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7820: Summary: [C++][Gandiva] Add CMake support for compiling LLVM's IR into a library Key: ARROW-7820 URL: https://issues.apache.org/jira/browse/ARROW-7820

[jira] [Created] (ARROW-7819) [C++][Gandiva] Implement gandiva-dump-ir tool to output llvm IR to a file

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7819: Summary: [C++][Gandiva] Implement gandiva-dump-ir tool to output llvm IR to a file Key: ARROW-7819 URL: https://issues.apache.org/jira/browse/ARROW-7819

[jira] [Created] (ARROW-7818) [C++][Gandiva] Generate Filter kernels from gandiva code at compile time

2020-02-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-7818: Summary: [C++][Gandiva] Generate Filter kernels from gandiva code at compile time Key: ARROW-7818 URL: https://issues.apache.org/jira/browse/ARROW-7818

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-10 Thread Neal Richardson
We also have the release blog post to finish up and publish: https://github.com/apache/arrow-site/pull/41 On Mon, Feb 10, 2020 at 7:32 AM Krisztián Szűcs wrote: > Updated checklist: > > - [DONE] marking the released version as "RELEASED" on JIRA > - [DONE] rebase the master branch on top of the

Re: Arrow Datasets Functionality for Python

2020-02-10 Thread Wes McKinney
I will add that I'm interested in being involved with developing high level Python interfaces to all of this functionality (e.g. using Ibis [1]). It would be worth prototyping at least a datasets interface layer for efficient data selection (predicate pushdown + filtering) and then expanding to sup

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-10 Thread Krisztián Szűcs
Updated checklist:
- [DONE] marking the released version as "RELEASED" on JIRA
- [DONE] rebase the master branch on top of the release branch
- [DONE] rebase the pull requests
- [DONE] uploading source release artifacts to SVN
- [DONE] uploading binary release artifacts to Bintray
- [DONE] updatin

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-10 Thread Krisztián Szűcs
Indeed, it is here https://lists.apache.org/thread.html/rf3a0840c152d7eeefb6862c3a54f986595f88b439b0c82780d15f998%40%3Cdev.arrow.apache.org%3E On Mon, Feb 10, 2020 at 4:25 PM Francois Saint-Jacques wrote: > > I received it. > > On Mon, Feb 10, 2020 at 10:24 AM Krisztián Szűcs > wrote: > > > > I'

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-10 Thread Francois Saint-Jacques
I received it. On Mon, Feb 10, 2020 at 10:24 AM Krisztián Szűcs wrote: > > I've already announced the release 4 hours ago with my @apache address. > Although I cannot see it in the archive [1]. > > I suppose it didn't go through, no clue why. > > [1] https://lists.apache.org/list.html?dev@arrow.a

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-10 Thread Krisztián Szűcs
I've already announced the release 4 hours ago with my @apache address. Although I cannot see it in the archive [1]. I suppose it didn't go through, no clue why. [1] https://lists.apache.org/list.html?dev@arrow.apache.org On Mon, Feb 10, 2020 at 3:57 PM Wes McKinney wrote: > > Can we announce

[jira] [Created] (ARROW-7817) [Integration] macOS R autobrew nightly fails

2020-02-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-7817: Summary: [Integration] macOS R autobrew nightly fails Key: ARROW-7817 URL: https://issues.apache.org/jira/browse/ARROW-7817 Project: Apache Arrow Issue T

Re: [NIGHTLY] Arrow Build Report for Job nightly-2020-02-10-0

2020-02-10 Thread Krisztián Szűcs
On Mon, Feb 10, 2020 at 2:47 PM Crossbow wrote: > > > Arrow Build Report for Job nightly-2020-02-10-0 > > All tasks: > https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-10-0 > > Failed Tasks: > - macos-r-autobrew: > URL: > https://github.com/ursa-labs/crossbow/branches/a

[jira] [Created] (ARROW-7816) [Integration] Turbodbc fails to compile in the nightly tests

2020-02-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-7816: Summary: [Integration] Turbodbc fails to compile in the nightly tests Key: ARROW-7816 URL: https://issues.apache.org/jira/browse/ARROW-7816 Project: Apache Arrow

[jira] [Created] (ARROW-7815) [C++] Fix crashes on corrupt IPC input (OSS-Fuzz)

2020-02-10 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-7815: Summary: [C++] Fix crashes on corrupt IPC input (OSS-Fuzz) Key: ARROW-7815 URL: https://issues.apache.org/jira/browse/ARROW-7815 Project: Apache Arrow Issu

Re: [VOTE] Release Apache Arrow 0.16.0 - RC2

2020-02-10 Thread Wes McKinney
Can we announce the release? Let me know if I can help with anything On Sun, Feb 9, 2020 at 6:23 PM Sutou Kouhei wrote: > > Hi, > > MSYS2 package is updated: > https://github.com/msys2/MINGW-packages/pull/6175 > > > Thanks, > -- > kou > > In > "Re: [VOTE] Release Apache Arrow 0.16.0 - RC2" on

Re: Arrow Datasets Functionality for Python

2020-02-10 Thread Francois Saint-Jacques
Hello Matthew, The dplyr binding is just syntactic sugar on top of the dataset API. There are no analytics capabilities yet [1], other than the select and the limited projection supported by the dataset API. It looks like it is doing analytics due to properly placed `collect()` calls, which converts

[NIGHTLY] Arrow Build Report for Job nightly-2020-02-10-0

2020-02-10 Thread Crossbow
Arrow Build Report for Job nightly-2020-02-10-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-10-0 Failed Tasks: - macos-r-autobrew: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-10-0-travis-macos-r-autobrew - test-conda-

[jira] [Created] (ARROW-7814) [C++] Simplify build-support/run-test.sh

2020-02-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-7814: Summary: [C++] Simplify build-support/run-test.sh Key: ARROW-7814 URL: https://issues.apache.org/jira/browse/ARROW-7814 Project: Apache Arrow Issue Type

[ANNOUNCE] Apache Arrow 0.16.0 released

2020-02-10 Thread Krisztián Szűcs
The Apache Arrow community is pleased to announce the 0.16.0 release. The release includes 735 resolved issues ([1]) since the 0.15.0 release. The release is available now from our website, [2] and [3]: https://arrow.apache.org/install/ Release notes are available at: https://arrow.apache

[jira] [Created] (ARROW-7813) Fix undefined behaviour and remove unsafe

2020-02-10 Thread Markus Westerlind (Jira)
Markus Westerlind created ARROW-7813: Summary: Fix undefined behaviour and remove unsafe Key: ARROW-7813 URL: https://issues.apache.org/jira/browse/ARROW-7813 Project: Apache Arrow Is