Re: JDBC Adapter for Apache-Arrow

2017-10-30 Thread Julian Hyde
How about writing an Arrow adapter for Calcite? I think it amounts to the same thing - you would inherit Calcite’s SQL parser and Avatica JDBC stack. Would this database be ephemeral (i.e. would the data go away when you close the connection)? If not, how would you know where to load the data f

JDBC Adapter for Apache-Arrow

2017-10-30 Thread Atul Dambalkar
Hi all, I wanted to open up a conversation here regarding developing a Java-based JDBC Adapter for Apache Arrow. I had a preliminary discussion with Wes McKinney and Siddharth Teotia on this a couple of weeks ago. Basically, at a high level (oversimplified), this adapter/API will allow up

Re: Faster PySpark UDFs using Apache Arrow in Spark 2.3.0

2017-10-30 Thread Phillip Cloud
Congrats Li! This is awesome.
On Mon, Oct 30, 2017 at 2:05 PM Wes McKinney wrote:
> hi all,
>
> One of our newest committers, Li Jin, has been driving efforts to speed up Python UDFs in Spark using Arrow. This was just written about today:
>
> https://databricks.com/blog/2017/10/30/introdu

[jira] [Created] (ARROW-1757) [C++] Add DictionaryArray::FromArrays alternate ctor that can check or sanitize "untrusted" indices

2017-10-30 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1757: --- Summary: [C++] Add DictionaryArray::FromArrays alternate ctor that can check or sanitize "untrusted" indices Key: ARROW-1757 URL: https://issues.apache.org/jira/browse/ARROW-1757

Arrow 0.8.0 release timeline

2017-10-30 Thread Wes McKinney
hi folks, With the way things are looking, I think we should aim to have all 0.8.0 work wrapped up by the end of next week, and hopefully be positioned to make a release candidate on Monday Nov 13 or Tuesday Nov 14. There are still 53 JIRAs in TODO in the release milestone and several in progress

Faster PySpark UDFs using Apache Arrow in Spark 2.3.0

2017-10-30 Thread Wes McKinney
hi all, One of our newest committers, Li Jin, has been driving efforts to speed up Python UDFs in Spark using Arrow. This was just written about today: https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html It's really exciting to see this kind of cross-project colla

[jira] [Created] (ARROW-1756) [Python] Observed int32 overflow in Feather write/read path

2017-10-30 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1756: --- Summary: [Python] Observed int32 overflow in Feather write/read path Key: ARROW-1756 URL: https://issues.apache.org/jira/browse/ARROW-1756 Project: Apache Arrow

[jira] [Created] (ARROW-1755) [C++] Add build options for MSVC to use static runtime libraries

2017-10-30 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1755: --- Summary: [C++] Add build options for MSVC to use static runtime libraries Key: ARROW-1755 URL: https://issues.apache.org/jira/browse/ARROW-1755 Project: Apache Arrow

[jira] [Created] (ARROW-1754) [Python] Fix buggy Parquet roundtrip when an index name is the same as a column name

2017-10-30 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1754: --- Summary: [Python] Fix buggy Parquet roundtrip when an index name is the same as a column name Key: ARROW-1754 URL: https://issues.apache.org/jira/browse/ARROW-1754 Proj

[jira] [Created] (ARROW-1753) [Python] Provide for matching subclasses with register_type in serialization context

2017-10-30 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1753: --- Summary: [Python] Provide for matching subclasses with register_type in serialization context Key: ARROW-1753 URL: https://issues.apache.org/jira/browse/ARROW-1753 Proj

[GitHub] xhochy closed pull request #18: ARROW-1752: Add GPU packages for Debian and Ubuntu

2017-10-30 Thread GitBox
xhochy closed pull request #18: ARROW-1752: Add GPU packages for Debian and Ubuntu URL: https://github.com/apache/arrow-dist/pull/18 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a fo

[GitHub] kou opened a new pull request #18: ARROW-1752: Add GPU packages for Debian and Ubuntu

2017-10-30 Thread GitBox
kou opened a new pull request #18: ARROW-1752: Add GPU packages for Debian and Ubuntu URL: https://github.com/apache/arrow-dist/pull/18 We need two Travis CI entries to build all .debs because we don't have enough space to build all .debs in one Travis CI entry.

[jira] [Created] (ARROW-1752) [Packaging] Add GPU packages for Debian and Ubuntu

2017-10-30 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-1752: --- Summary: [Packaging] Add GPU packages for Debian and Ubuntu Key: ARROW-1752 URL: https://issues.apache.org/jira/browse/ARROW-1752 Project: Apache Arrow Issue T