[jira] [Resolved] (ARROW-256) Add versioning to the arrow spec.

2016-09-08 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-256. Resolution: Fixed Issue resolved by pull request 125 [https://github.com/apache/arrow/pull/125] > Ad

[jira] [Resolved] (ARROW-286) Build thirdparty dependencies in parallel

2016-09-08 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-286. Resolution: Fixed Issue resolved by pull request 133 [https://github.com/apache/arrow/pull/133] > Bu

Re: Arrow package for Python

2016-09-08 Thread Julien Le Dem
Thanks Uwe, I've been able to try it from the instructions in the README, generating a Parquet file from Python and reading it with parquet-tools (in Java). On Wed, Sep 7, 2016 at 11:36 PM, Uwe Korn wrote: > Hello Julien, > > what I have mentioned in the sync up was that I'm able to build a bina

Re: Arrow File with Multiple Record Batches

2016-09-08 Thread Brian Hulette
Ah got it, thanks Julien. I was thinking that each RecordBatch could have different schemas, which in retrospect doesn't seem very logical. In essence I guess I was thinking each record batch was a partition of the schema's fields, instead of a partition of the entire dataset. Thanks for clea

Re: Arrow File with Multiple Record Batches

2016-09-08 Thread Julien Le Dem
Hi Brian, It's not one record batch per field. Each field describes a column in the schema. Record batches are partitions of the dataset. As such all record batches have the same schema which is defined in the footer. There can be any number of record batches for a given schema. Then in each recor
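The relationship described in the reply above (one schema, defined once in the footer, shared by any number of record batches that each hold a slice of the rows) can be sketched in plain Python. This is only an illustrative model under stated assumptions: the names `Field`, `RecordBatch`, and `ArrowFileSketch` are hypothetical and are not the Arrow API, and no real file layout or serialization is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field: describes one column."""
    name: str
    type: str


@dataclass
class RecordBatch:
    """Hypothetical record batch: a partition of the rows, one list per column."""
    columns: Dict[str, list]

    @property
    def num_rows(self) -> int:
        # All columns have the same length, so any one of them gives the row count.
        return len(next(iter(self.columns.values()), []))


@dataclass
class ArrowFileSketch:
    """Hypothetical model of a file: one schema plus any number of batches."""
    schema: List[Field]
    batches: List[RecordBatch] = field(default_factory=list)

    def append(self, batch: RecordBatch) -> None:
        # Every batch must carry exactly the columns named by the shared schema.
        expected = [f.name for f in self.schema]
        if list(batch.columns) != expected:
            raise ValueError(f"batch columns {list(batch.columns)} != schema {expected}")
        # Within a batch, every column must have the same number of rows.
        if len({len(values) for values in batch.columns.values()}) > 1:
            raise ValueError("columns in a record batch must have equal length")
        self.batches.append(batch)

    def total_rows(self) -> int:
        # The dataset is the concatenation of all batches.
        return sum(b.num_rows for b in self.batches)


# Two batches partition the rows of the dataset; both follow the same schema.
arrow_file = ArrowFileSketch(schema=[Field("id", "int32"), Field("name", "utf8")])
arrow_file.append(RecordBatch({"id": [1, 2], "name": ["a", "b"]}))
arrow_file.append(RecordBatch({"id": [3], "name": ["c"]}))
print(arrow_file.total_rows())
```

A batch that held only a subset of the fields (the "partition of the schema's fields" reading) would be rejected by `append` in this sketch, which mirrors the point made in the reply.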

[jira] [Created] (ARROW-286) Build thirdparty dependencies in parallel

2016-09-08 Thread Uwe L. Korn (JIRA)
Uwe L. Korn created ARROW-286: Summary: Build thirdparty dependencies in parallel Key: ARROW-286 URL: https://issues.apache.org/jira/browse/ARROW-286 Project: Apache Arrow Issue Type: Improvemen

[jira] [Resolved] (ARROW-285) Allow for custom flatc compiler

2016-09-08 Thread Julien Le Dem (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem resolved ARROW-285. Resolution: Fixed Issue resolved by pull request 129 [https://github.com/apache/arrow/pull/129] >

[jira] [Updated] (ARROW-285) Allow for custom flatc compiler

2016-09-08 Thread Laurent Goujon (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laurent Goujon updated ARROW-285: Priority: Minor (was: Major) > Allow for custom flatc compiler

[jira] [Created] (ARROW-285) Allow for custom flatc compiler

2016-09-08 Thread Laurent Goujon (JIRA)
Laurent Goujon created ARROW-285: Summary: Allow for custom flatc compiler Key: ARROW-285 URL: https://issues.apache.org/jira/browse/ARROW-285 Project: Apache Arrow Issue Type: Improvement

Arrow File with Multiple Record Batches

2016-09-08 Thread Brian Hulette
Hi all, I'm very interested in the Arrow file format - I would eventually like to use it to export data in a columnar format that can be read directly in a browser through a JavaScript library. I've been reviewing the specification and Julien's Java implementation, and I'm a little bit confused a