Re: [DISCUSS] Expanding Arrow interval type metadata, changing Java memory representation

2017-10-19 Thread Li Jin
+1 on this one. My reason is that this makes timestamp/interval calculations faster, i.e., "timestamp + interval < timestamp" should be faster without dealing with two components in the interval. Although I am not quite sure about the rationale behind the two-component representation, which seems to be what is

Re: [DISCUSS] Removing the "page" field from the Buffer record batch Arrow metadata

2017-10-19 Thread Li Jin
+1 for the change too.

[jira] [Created] (ARROW-1692) [Python, Java] UnionArray round trip not working

2017-10-19 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-1692: Summary: [Python, Java] UnionArray round trip not working Key: ARROW-1692 URL: https://issues.apache.org/jira/browse/ARROW-1692 Project: Apache Arrow Issue

Fwd: [DISCUSS] Storage-class memory ecosystem program

2017-10-19 Thread Julian Hyde
This thread on general@incubator may be of interest to Arrow. Julian > Begin forwarded message: > > From: "Gang(Gary) Wang" > Subject: [DISCUSS] Storage-class memory ecosystem program > Date: October 19, 2017 at 11:55:46 AM PDT > To: gene...@incubator.apache.org > Cc: d...@mnemonic.incubator.

Arrow JS tasks and roadmap

2017-10-19 Thread Paul Taylor
Brian Hulette and I have outlined this list of tasks/improvements for the expanded Arrow JS implementation: https://docs.google.com/document/d/142dek89oM2TVI2Yql106Zo8IB1Ff_9zDg_EG6jPWS0M/edit?usp=sharing

Re: [DISCUSS] Removing the "page" field from the Buffer record batch Arrow metadata

2017-10-19 Thread Philipp Moritz
+1 for the change. I'm all for making the metadata small and solving it in a different way if the field is really needed. Users who do not need the feature shouldn't have to pay for it. On Thu, Oct 19, 2017 at 6:02 PM, Wes McKinney wrote: > The JIRA for this is https://issues.apache.org/jira/browse

[jira] [Created] (ARROW-1691) [Java] Conform Java Decimal type implementation to format decisions in ARROW-1588

2017-10-19 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1691: Summary: [Java] Conform Java Decimal type implementation to format decisions in ARROW-1588 Key: ARROW-1691 URL: https://issues.apache.org/jira/browse/ARROW-1691 Project

Re: [DISCUSS] Removing the "page" field from the Buffer record batch Arrow metadata

2017-10-19 Thread Wes McKinney
The JIRA for this is https://issues.apache.org/jira/browse/ARROW-1409. I will wait a little while for others to weigh in, but after that I can write a patch to remove the attribute and bump the metadata format version number. On Thu, Oct 19, 2017 at 4:37 PM, Bryan Cutler wrote: > +1, sounds ok to

Re: [DISCUSS] Removing the "page" field from the Buffer record batch Arrow metadata

2017-10-19 Thread Bryan Cutler
+1, sounds ok to me to try to solve this problem a different way in the future once needed. On Thu, Oct 19, 2017 at 12:30 PM, Jacques Nadeau wrote: > Seems reasonable. I was among those that originally argued for this field > but given that we haven't used it yet, I think your proposal makes sen

Re: [DISCUSS] Removing the "page" field from the Buffer record batch Arrow metadata

2017-10-19 Thread Jacques Nadeau
Seems reasonable. I was among those that originally argued for this field but given that we haven't used it yet, I think your proposal makes sense. +1 On Wed, Oct 18, 2017 at 5:40 PM, Wes McKinney wrote: > When we originally drafted the metadata for record batches, we > included a "page id" in

More on Splitting

2017-10-19 Thread Clay Baenziger (BLOOMBERG/ 731 LEX)
Hi Hector and Wes,
To consider different splitting heuristics, the following is a good read: https://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
Particularly:
* ConstantSizeRegionSplitPolicy
* IncreasingToUpperBoundRegionSplitPolicy
* KeyPrefixRegionSplitPolicy
Also, as you gu

Re: [DISCUSS] Arrow release management for 0.8.0 and beyond

2017-10-19 Thread Wes McKinney
I would be fine with splitting up the work. For example, for the 0.8.0 release perhaps another PMC can do the source release (to make sure we have fully documented the system requirements necessary for cutting the release tarball) and I can take care of updating the website, and others can handle t

Re: [DISCUSS] Arrow release management for 0.8.0 and beyond

2017-10-19 Thread Bryan Cutler
Wes, thanks for taking on so much of the release management so far! I'd be glad to help out with the next release. From the document you wrote, are you wanting someone to fully do the next release or would it work if we divide up some of the tasks? Bryan On Wed, Oct 18, 2017 at 6:06 PM, Wes McK

[jira] [Created] (ARROW-1690) [GLib] Add garrow_array_is_valid()

2017-10-19 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-1690: Summary: [GLib] Add garrow_array_is_valid() Key: ARROW-1690 URL: https://issues.apache.org/jira/browse/ARROW-1690 Project: Apache Arrow Issue Type: New Feature

[jira] [Created] (ARROW-1689) [Python] Categorical Indices Should Be Zero-Copy

2017-10-19 Thread Nick White (JIRA)
Nick White created ARROW-1689: Summary: [Python] Categorical Indices Should Be Zero-Copy Key: ARROW-1689 URL: https://issues.apache.org/jira/browse/ARROW-1689 Project: Apache Arrow Issue Type: I