[DISCUSS] Arrow release management for 0.8.0 and beyond

2017-10-18 Thread Wes McKinney
hi folks, I wrote up a document describing the work involved with an Arrow release: https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md I've managed the last 7 releases -- I think it would be good for other committers or PMC members to be exposed to the work involved w

[DISCUSS] Removing the "page" field from the Buffer record batch Arrow metadata

2017-10-18 Thread Wes McKinney
When we originally drafted the metadata for record batches, we included a "page id" in the Buffer struct: https://github.com/apache/arrow/blob/master/format/Schema.fbs#L295 The idea at the time was that record batches might not be colocated in a particular shared memory page. This might still hap
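To make the role of the "page id" concrete, here is a rough sketch of how a reader might resolve a buffer that carries one. It is not the Arrow implementation: the `BufferMeta` class, the `offset` and `length` field names, the `pages` mapping, and `resolve_buffer` are all assumptions made for illustration; the message above only confirms that the Buffer struct includes a page id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferMeta:
    # Hypothetical stand-in for the Buffer metadata struct.
    page: int    # id of the shared memory page holding the bytes
    offset: int  # assumed field: start of the buffer within that page
    length: int  # assumed field: size of the buffer in bytes


def resolve_buffer(pages, meta):
    """Return the bytes a BufferMeta points to, looking up its page first."""
    region = pages[meta.page]
    return region[meta.offset:meta.offset + meta.length]


# Two toy "pages"; a record batch could reference buffers in either one.
pages = {0: b"abcdefgh", 1: b"ijklmnop"}
data = resolve_buffer(pages, BufferMeta(page=1, offset=2, length=3))
```

If every buffer of a record batch lives in one contiguous region, the page lookup is redundant and only the offset and length are needed, which is the situation the proposal to remove the field has in mind.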

[DISCUSS] Expanding Arrow interval type metadata, changing Java memory representation

2017-10-18 Thread Wes McKinney
I opened this patch over 2 months ago to add some additional metadata for intervals: https://github.com/apache/arrow/pull/920 Java supports a two-component DAY_TIME interval type as a combo of days and milliseconds: https://github.com/apache/arrow/blob/402baa4ec391b61dd37c770ae7978d51b9b550fa/ja
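As an illustration of the two-component representation mentioned above, a DAY_TIME interval can be modelled as a pair of integers. This is a minimal sketch, not the Java vector layout: the `DayTimeInterval` class and its `to_timedelta` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class DayTimeInterval:
    # Hypothetical model of a DAY_TIME interval: two separate components.
    days: int
    milliseconds: int

    def to_timedelta(self) -> timedelta:
        """Collapse both components into a single Python timedelta."""
        return timedelta(days=self.days, milliseconds=self.milliseconds)


interval = DayTimeInterval(days=2, milliseconds=1500)
total = interval.to_timedelta()
```

Keeping the components separate preserves how the value was expressed (2 days and 1500 ms), while a single duration keeps only the total.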

[jira] [Created] (ARROW-1688) [Java] Fail build on checkstyle warnings

2017-10-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1688: --- Summary: [Java] Fail build on checkstyle warnings Key: ARROW-1688 URL: https://issues.apache.org/jira/browse/ARROW-1688 Project: Apache Arrow Issue Type: Impro

[jira] [Created] (ARROW-1687) [Python] Expose UnionArray to pyarrow

2017-10-18 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-1687: - Summary: [Python] Expose UnionArray to pyarrow Key: ARROW-1687 URL: https://issues.apache.org/jira/browse/ARROW-1687 Project: Apache Arrow Issue Type: Impr

Re: Arrow sync at 16:00 UTC (12pm US/Eastern) today

2017-10-18 Thread Wes McKinney
Short call today, notes on attendees and topics: - Wes (Two Sigma) - Release management - 0.8.0 release - Jacques (Dremio) - Heimir Sverrisson (Mojotech) - Docker testing for Dask - Clark Fitzgerald (UC Davis) - Sidd (Dremio) - ARROW-1463 update - Li (Two Sigma) - Bryan Cutler (IBM) - Phil

[jira] [Created] (ARROW-1686) Documentation generation script creates "apidocs" directory under site/java

2017-10-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1686: --- Summary: Documentation generation script creates "apidocs" directory under site/java Key: ARROW-1686 URL: https://issues.apache.org/jira/browse/ARROW-1686 Project: Apac

Re: Arrow sync at 16:00 UTC (12pm US/Eastern) today

2017-10-18 Thread Jacques Nadeau
We can use this on an ongoing basis: https://meet.google.com/vtm-teks-phx On Wed, Oct 18, 2017 at 6:59 AM, Wes McKinney wrote: > Could someone post a hangout/Google Meet link capable of hosting more > than 10 participants? > > Thanks >

[jira] [Created] (ARROW-1685) [GLib] Add GArrowTableReader

2017-10-18 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-1685: --- Summary: [GLib] Add GArrowTableReader Key: ARROW-1685 URL: https://issues.apache.org/jira/browse/ARROW-1685 Project: Apache Arrow Issue Type: New Feature

Re: Arrow sync at 16:00 UTC (12pm US/Eastern) today

2017-10-18 Thread Matt Meinel
freeconferencecall.com works well for my startup. Up to 1000 callers I think. > On Oct 18, 2017, at 8:59 AM, Wes McKinney wrote: > > Could someone post a hangout/Google Meet link capable of hosting more > than 10 participants? > > Thanks

[jira] [Created] (ARROW-1684) [Python] Simplify user API for reading nested Parquet columns

2017-10-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1684: --- Summary: [Python] Simplify user API for reading nested Parquet columns Key: ARROW-1684 URL: https://issues.apache.org/jira/browse/ARROW-1684 Project: Apache Arrow

Arrow sync at 16:00 UTC (12pm US/Eastern) today

2017-10-18 Thread Wes McKinney
Could someone post a hangout/Google Meet link capable of hosting more than 10 participants? Thanks

[jira] [Created] (ARROW-1683) [Python] Restore "TimestampType" to pyarrow namespace

2017-10-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1683: --- Summary: [Python] Restore "TimestampType" to pyarrow namespace Key: ARROW-1683 URL: https://issues.apache.org/jira/browse/ARROW-1683 Project: Apache Arrow Issu

[jira] [Created] (ARROW-1682) [Python] Add documentation / example for reading a directory of Parquet files on S3

2017-10-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1682: --- Summary: [Python] Add documentation / example for reading a directory of Parquet files on S3 Key: ARROW-1682 URL: https://issues.apache.org/jira/browse/ARROW-1682 Proje

[jira] [Created] (ARROW-1681) [Python] Error writing with nulls in lists

2017-10-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1681: --- Summary: [Python] Error writing with nulls in lists Key: ARROW-1681 URL: https://issues.apache.org/jira/browse/ARROW-1681 Project: Apache Arrow Issue Type: Bug