I will try to review tomorrow and cast a vote.
On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney wrote:
> There is only 1 binding +1 vote so far; we should probably wait for
> three before closing the vote (it's possible that lazy consensus could
> be employed here but not much harm in waiting a few
Vladimir created ARROW-7867:
---
Summary: ArrowIOError: Invalid Parquet file size is 0 bytes on
reading from S3
Key: ARROW-7867
URL: https://issues.apache.org/jira/browse/ARROW-7867
Project: Apache Arrow
Istvan Szukacs created ARROW-7866:
---
Summary: [Rust] How to handle aggregates with Datafusion?
Key: ARROW-7866
URL: https://issues.apache.org/jira/browse/ARROW-7866
Project: Apache Arrow
Issue
I should note, it isn't necessarily just the extra metadata. For single
row values, there is also an overhead for padding requirements. You should
be able to measure this by looking at the size of the buffer you are using
before writing any batches to the stream (I believe the schema is written
e
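
As a rough illustration of the padding overhead described above, the sketch below computes how many bytes a buffer occupies once it is rounded up to an alignment boundary. It assumes 8-byte alignment, which the Arrow columnar format requires for buffers in IPC messages (implementations may pad to 64 bytes instead). `padded_size` is a hypothetical helper written for this example, not a pyarrow function, and the per-message metadata is not counted here.

```python
def padded_size(nbytes: int, alignment: int = 8) -> int:
    """Round nbytes up to the next multiple of alignment.

    The default of 8 is an assumption based on the Arrow IPC
    alignment requirement; pass 64 to model 64-byte padding.
    """
    if nbytes < 0 or alignment <= 0:
        raise ValueError("nbytes must be >= 0 and alignment must be > 0")
    return -(-nbytes // alignment) * alignment


# One row of a nullable int32 column: 4 bytes of values plus
# a 1-byte validity bitmap (the bitmap may be omitted when there are no nulls).
data_bytes = 4
validity_bytes = 1

payload = data_bytes + validity_bytes
padded = padded_size(data_bytes) + padded_size(validity_bytes)

print(f"payload: {payload} bytes, padded: {padded} bytes")
```

For a single row the padding is larger than the payload itself, while for a batch of many rows it becomes negligible, which is why the overhead is most visible with single-row batches.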
Hi all,
I understand that using Arrow in our projects gives us
interoperability as well as faster access. I have a couple of questions on
how we can use it for the following use case, and whether this is a good way
to use it:
1. Will the Spark execution be faster when I use joins on DF with Arrow
com
Arrow Build Report for Job nightly-2020-02-16-0
All tasks:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-16-0
Failed Tasks:
- centos-7:
URL:
https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-02-16-0-azure-centos-7
- conda-linux-gcc-py27:
URL: