Re: [VOTE] Adopt Arrow in-process C Data Interface specification

2020-02-16 Thread Micah Kornfield
I will try to review tomorrow and cast a vote.
On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney wrote:
> There is only 1 binding +1 vote so far, we should probably wait for
> three before closing the vote (it's possible that lazy consensus could
> be employed here but not much harm in waiting a few

[jira] [Created] (ARROW-7867) ArrowIOError: Invalid Parquet file size is 0 bytes on reading from S3

2020-02-16 Thread Vladimir (Jira)
Vladimir created ARROW-7867:
Summary: ArrowIOError: Invalid Parquet file size is 0 bytes on reading from S3
Key: ARROW-7867
URL: https://issues.apache.org/jira/browse/ARROW-7867
Project: Apache Arrow

[jira] [Created] (ARROW-7866) [Rust] How to handle aggregates with Datafusion?

2020-02-16 Thread Istvan Szukacs (Jira)
Istvan Szukacs created ARROW-7866:
Summary: [Rust] How to handle aggregates with Datafusion?
Key: ARROW-7866
URL: https://issues.apache.org/jira/browse/ARROW-7866
Project: Apache Arrow
Issue

Re: Schemaless serialization

2020-02-16 Thread Micah Kornfield
I should note, it isn't necessarily just the extra metadata. For single row values, there is also an overhead for padding requirements. You should be able to measure this by looking at the size of the buffer you are using before writing any batches to the stream (I believe the schema is written e

Basic question on Apache Arrow

2020-02-16 Thread Subash Prabakar
Hi all, I understand that Arrow can be used in our projects for interoperability as well as faster access. I have a couple of questions on how we can use it for the following use case, and whether this is a good way to use it: 1. Will the Spark execution be faster when I use joins on DF with Arrow com

[NIGHTLY] Arrow Build Report for Job nightly-2020-02-16-0

2020-02-16 Thread Crossbow
Arrow Build Report for Job nightly-2020-02-16-0
All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-16-0
Failed Tasks:
- centos-7: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-16-0-azure-centos-7
- conda-linux-gcc-py27: URL: