[jira] [Created] (ARROW-1263) [C++] CpuInfo should be able to get CPU features on Windows

2017-07-24 Thread Max Risuhin (JIRA)
Max Risuhin created ARROW-1263: -- Summary: [C++] CpuInfo should be able to get CPU features on Windows Key: ARROW-1263 URL: https://issues.apache.org/jira/browse/ARROW-1263 Project: Apache Arrow

Re: Use case for R Arrow Bindings

2017-07-24 Thread Clark Fitzgerald
Great, I'll be on the call. The first steps I took today with the automatically generated bindings from the C++ source seem promising. Much more work is required to make it usable though. On Mon, Jul 24, 2017 at 9:00 PM, Kevin Moore wrote: > A group of Quilt users and team members interested in

Re: Use case for R Arrow Bindings

2017-07-24 Thread Kevin Moore
A group of Quilt users and team members interested in R is planning a short call to get the ball rolling on R bindings for Arrow (and Quilt) tomorrow at 4PM Pacific. We'd love to have anyone who's interested from this list join us in the hangout: https://hangouts.google.com/hangouts/_/quiltdata.io/

[jira] [Created] (ARROW-1262) Packaging automation in arrow-dist

2017-07-24 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1262: --- Summary: Packaging automation in arrow-dist Key: ARROW-1262 URL: https://issues.apache.org/jira/browse/ARROW-1262 Project: Apache Arrow Issue Type: Task

[jira] [Created] (ARROW-1261) [Java] Add container type for Map logical type

2017-07-24 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1261: --- Summary: [Java] Add container type for Map logical type Key: ARROW-1261 URL: https://issues.apache.org/jira/browse/ARROW-1261 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-1260) [Plasma] Use factory method to create Python PlasmaClient

2017-07-24 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-1260: - Summary: [Plasma] Use factory method to create Python PlasmaClient Key: ARROW-1260 URL: https://issues.apache.org/jira/browse/ARROW-1260 Project: Apache Arrow

Re: CI reliability?

2017-07-24 Thread Philipp Moritz
I'm really sorry for the inconvenience! This should fix one part of the problem and make the build times a lot more tolerable: https://github.com/apache/arrow/pull/882 (we should still fix the recomputation problem) -- Philipp. On Mon, Jul 24, 2017 at 5:24 PM, Jacques Nadeau wrote: > Got it, t

[jira] [Created] (ARROW-1259) [Plasma] Speed up Plasma tests

2017-07-24 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-1259: - Summary: [Plasma] Speed up Plasma tests Key: ARROW-1259 URL: https://issues.apache.org/jira/browse/ARROW-1259 Project: Apache Arrow Issue Type: Improvement

Re: CI reliability?

2017-07-24 Thread Jacques Nadeau
Got it, thanks Wes. I haven't been working with travis for retriggering so didn't know if there was something more elegant. On Mon, Jul 24, 2017 at 4:01 PM, Wes McKinney wrote: > That particular job is timing out because it's configured in a way > that's causing various steps to be recomputed >

[jira] [Created] (ARROW-1258) [C++] Suppress dlmalloc warnings on Clang

2017-07-24 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1258: --- Summary: [C++] Suppress dlmalloc warnings on Clang Key: ARROW-1258 URL: https://issues.apache.org/jira/browse/ARROW-1258 Project: Apache Arrow Issue Type: Impr

Re: CI reliability?

2017-07-24 Thread Wes McKinney
That particular job is timing out because it's configured in a way that's causing various steps to be recomputed https://issues.apache.org/jira/browse/ARROW-1253 I plan to fix it as soon as I can get the package builds for 0.5.0 sorted out, maybe later tonight, but have already been working aroun

CI reliability?

2017-07-24 Thread Jacques Nadeau
Hey All, I'm wondering how reliable the CI stuff has been. I just saw a CI job that was terminated due to time limits. Is that a common occurrence? Is there an easy way to retrigger? https://travis-ci.org/apache/arrow/builds/257002967 thanks, Jacques

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-24 Thread Wes McKinney
I agree those things would be nice to have. Hardening the memory format details probably would not take longer than a month or so if we were to focus in on it. Formalizing REST / RPC or IPC seems like it will be more work, or will require a design period and then initial implementation. I think ha

Re: Packaging automation for Arrow releases

2017-07-24 Thread Wes McKinney
Yes, definitely. I'll create a packaging / arrow-dist automation umbrella JIRA and attach tasks to it On Mon, Jul 24, 2017 at 4:17 PM, Jacques Nadeau wrote: > Hey Wes, > > Does it make sense to create some jiras around this and then maybe some > people can pick them up? > > On Mon, Jul 24, 2017 a

[jira] [Created] (ARROW-1257) [Plasma] Plasma documentation

2017-07-24 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-1257: - Summary: [Plasma] Plasma documentation Key: ARROW-1257 URL: https://issues.apache.org/jira/browse/ARROW-1257 Project: Apache Arrow Issue Type: Improvement

Re: Packaging automation for Arrow releases

2017-07-24 Thread Jacques Nadeau
Hey Wes, Does it make sense to create some jiras around this and then maybe some people can pick them up? On Mon, Jul 24, 2017 at 7:55 AM, Wes McKinney wrote: > We're accumulating more deployment targets and possible package > artifacts from Arrow releases, such as: > > - Source tarball > - Jav

Re: [DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-24 Thread Jacques Nadeau
Top things on my list: - Formalize Arrow RPC and/or REST - Some reference transformation algorithms - Prototype IPC On Mon, Jul 24, 2017 at 9:47 AM, Wes McKinney wrote: > hi folks, > > In recent discussions, since the Arrow memory format and metadata has > become reasonably stabilized, and we'r

[jira] [Created] (ARROW-1256) [Plasma] Fix compile warnings on macOS

2017-07-24 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-1256: - Summary: [Plasma] Fix compile warnings on macOS Key: ARROW-1256 URL: https://issues.apache.org/jira/browse/ARROW-1256 Project: Apache Arrow Issue Type: Imp

[jira] [Created] (ARROW-1255) [Plasma] Check plasma flatbuffer messages with the flatbuffer verifier

2017-07-24 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-1255: - Summary: [Plasma] Check plasma flatbuffer messages with the flatbuffer verifier Key: ARROW-1255 URL: https://issues.apache.org/jira/browse/ARROW-1255 Project: Apache

Re: [VOTE] Accept contribution of Plasma Object Store

2017-07-24 Thread Julien Le Dem
+1 On Sun, Jul 23, 2017 at 8:00 AM, Arun K. Subramaniyan wrote: > +1 > > On Sun, Jul 23, 2017 at 1:16 AM Uwe L. Korn wrote: > > > +1 > > > > On Fri, Jul 21, 2017, at 01:37 AM, Julian Hyde wrote: > > > +1 > > > > > > > On Jul 20, 2017, at 3:07 PM, Bryan Cutler wrote: > > > > > > > > +1 sounds g

[DISCUSS] The road from Arrow 0.5.0 to 1.0.0

2017-07-24 Thread Wes McKinney
hi folks, In recent discussions, since the Arrow memory format and metadata have become reasonably stabilized, and we're more likely to add new data types than change existing ones, we may consider making a 1.0.0 to declare to the rest of the open source world that "Arrow is open for business" and

Re: Parquet+Arrow Java

2017-07-24 Thread Wes McKinney
Currently in C++, I believe the Parquet interface produces a single record batch per read (which is generally a whole row group or a whole file with some number of columns selected). In principle, it would be better to generate a sequence of smaller record batches (e.g. with 64K rows or so). We sup

Re: Parquet+Arrow Java

2017-07-24 Thread Masayuki Takahashi
Hi Wes, I understood it thanks to the explanation. And I will refer to the C++ implementation. > but I suspect we will eventually need a "scanner" that yields a > sequence of evenly sized record batches (so individual chunks are not > too large in memory). Such an interface can be used in an asynchr
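
Editor's sketch of the "scanner" idea quoted above: a generator that yields fixed-size batches so that no single chunk grows too large in memory. It is plain Python and not the Arrow or Parquet API. The name scan_batches, the dict-of-lists stand-in for a columnar table, and the 65536-row default (taken from the "64K rows or so" figure in the thread) are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for a columnar table: column name -> list of values.
Columns = Dict[str, List]


def scan_batches(columns: Columns, batch_rows: int = 65536) -> Iterator[Columns]:
    """Yield consecutive slices of `columns`, each with at most `batch_rows` rows."""
    if batch_rows <= 0:
        raise ValueError("batch_rows must be positive")
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    num_rows = lengths.pop() if lengths else 0
    # Walk the row range in steps of batch_rows; the last batch may be shorter.
    for start in range(0, num_rows, batch_rows):
        stop = start + batch_rows
        yield {name: values[start:stop] for name, values in columns.items()}


if __name__ == "__main__":
    table = {"id": list(range(10)), "label": list("abcdefghij")}
    for batch in scan_batches(table, batch_rows=4):
        print(len(batch["id"]), batch["label"])
```

Because the batches are produced lazily, a consumer only ever holds one batch at a time, which is the property the quoted message asks for.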

Re: Use case for R Arrow Bindings

2017-07-24 Thread Wes McKinney
+ Hadley On Fri, Jul 21, 2017 at 2:04 PM, Bryan Cutler wrote: > Thanks Clark. I know that SparkR would benefit a lot from Arrow bindings > and many people would like to see that, but to my knowledge no one has > started working on this yet. Please keep us updated with what you find! > > Bryan >

Packaging automation for Arrow releases

2017-07-24 Thread Wes McKinney
We're accumulating more deployment targets and possible package artifacts from Arrow releases, such as: - Source tarball - Java JARs - Python wheels (for pip), 3 platforms - Python conda packages (for conda), 3 platforms - .deb/.rpm packages for C++, GLib C Doing all of this manually is a lot of

[jira] [Created] (ARROW-1254) failing installation on osx / linux

2017-07-24 Thread Jeff Reback (JIRA)
Jeff Reback created ARROW-1254: -- Summary: failing installation on osx / linux Key: ARROW-1254 URL: https://issues.apache.org/jira/browse/ARROW-1254 Project: Apache Arrow Issue Type: Bug Affe