Re: Java Examples Writing Flatbuffer in IPC Message

2017-09-18 Thread Li Jin
Here is the code Wes is referring to: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala#L73 This turns Spark rows into Arrow file format. Here is the reading part in python: https://github.com/apache/spark/blob/master/py

Re: Java Examples Writing Flatbuffer in IPC Message

2017-09-18 Thread Wes McKinney
I would suggest you take a look at the Arrow converter in Spark (search in the codebase), or one of the other developers or users may be able to respond on the list. On Fri, Sep 15, 2017 at 11:41 AM, Andrew Pham (BLOOMBERG/ 731 LEX) wrote: > Thanks Wes. I'm going over the Java source code and I

Re: Next Arrow sync call

2017-09-18 Thread Aneesh Karve
Greetings, Regarding a "champion to bootstrap R bindings," I'm looking to hire an R/C++ contractor (or FT) as an open source contributor to Arrow, given R's strategic importance to our data package manager. If you know anyone who's qualified, please send them my way. Her

[jira] [Created] (ARROW-1553) Implement setInitialCapacity for MapWriter and pass on this capacity during lazy creation of child vectors

2017-09-18 Thread Siddharth Teotia (JIRA)
Siddharth Teotia created ARROW-1553: --- Summary: Implement setInitialCapacity for MapWriter and pass on this capacity during lazy creation of child vectors Key: ARROW-1553 URL: https://issues.apache.org/jira/brows

[jira] [Created] (ARROW-1552) [C++] Enable Arrow production builds on Linux / macOS without Boost dependency

2017-09-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1552: --- Summary: [C++] Enable Arrow production builds on Linux / macOS without Boost dependency Key: ARROW-1552 URL: https://issues.apache.org/jira/browse/ARROW-1552 Project: A

[jira] [Created] (ARROW-1551) [Website] Updates for 0.7.0 release

2017-09-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1551: --- Summary: [Website] Updates for 0.7.0 release Key: ARROW-1551 URL: https://issues.apache.org/jira/browse/ARROW-1551 Project: Apache Arrow Issue Type: Improvemen

Re: Decimal Format

2017-09-18 Thread Wes McKinney
hi Phillip, Thanks for bringing up these questions. I have a few questions about this: 1. What format is parquet-mr (e.g. Hive, Spark) writing for various precisions? Is it always 16 bytes? My understanding is that they use BYTE_ARRAY instead of FIXED_LEN_BYTE_ARRAY in parquet-mr for decimals 2

Decimal Format

2017-09-18 Thread Phillip Cloud
Hi all, I’d like to propose the following changes to the in-memory Decimal128 format and solicit feedback. 1. When converting to and from an array of bytes, the input bytes are *assumed* to be in big-endian order and the output bytes are *guaranteed* to be in big-endian order. Additional
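
The byte-order rule in item 1 above can be illustrated with a minimal sketch. It is not Arrow's implementation: the function names `decimal128_to_bytes` and `decimal128_from_bytes` are hypothetical, and the use of a two's-complement signed encoding and a fixed 16-byte output width are assumptions not stated in the message excerpt.

```python
# Hypothetical helpers illustrating big-endian conversion of a 128-bit
# unscaled decimal value. Two's-complement encoding and the 16-byte output
# width are assumptions made for this sketch, not taken from the proposal.

DECIMAL128_WIDTH = 16  # bytes in a 128-bit value


def decimal128_to_bytes(unscaled: int) -> bytes:
    """Encode a signed unscaled value as exactly 16 big-endian bytes."""
    # int.to_bytes raises OverflowError if the value does not fit in 128 bits.
    return unscaled.to_bytes(DECIMAL128_WIDTH, byteorder="big", signed=True)


def decimal128_from_bytes(data: bytes) -> int:
    """Decode 1 to 16 bytes, interpreting them as big-endian and signed."""
    if not 1 <= len(data) <= DECIMAL128_WIDTH:
        raise ValueError("expected between 1 and 16 bytes")
    # Shorter inputs are sign-extended implicitly by the signed decoding.
    return int.from_bytes(data, byteorder="big", signed=True)


if __name__ == "__main__":
    encoded = decimal128_to_bytes(12345)
    print(encoded.hex())
    print(decimal128_from_bytes(encoded))
```

With big-endian output the most significant byte comes first, so small positive values are left-padded with zero bytes and small negative values with 0xff bytes.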

[jira] [Created] (ARROW-1550) [Python] Fix flaky test on Windows

2017-09-18 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-1550: --- Summary: [Python] Fix flaky test on Windows Key: ARROW-1550 URL: https://issues.apache.org/jira/browse/ARROW-1550 Project: Apache Arrow Issue Type: Bug