Re: [C++] Enhancements to random Array/ChunkedArray/Table generator as a separate PR?

2021-01-31 Thread Micah Kornfield
I think it is OK to have a separate PR for random nested data generation. We wanted to do this for Parquet as well, but didn't get to it. Instead, we constructed a very detailed set of nesting-level tests. On Sun, Jan 31, 2021 at 9:12 PM Ying Zhou wrote: > Hi, > > As a part of the process of red

Re: [C++] Shall we modify the ORC reader?

2021-01-31 Thread Micah Kornfield
It probably makes sense to make this option configurable. I think it is OK to change the default to use Maps. My guess is the initial ORC implementation predated having a Map type in the specification. On Thu, Jan 28, 2021 at 9:28 AM Ying Zhou wrote: > Hi, > > Really thanks Deepak! > > I reall

[C++] Enhancements to random Array/ChunkedArray/Table generator as a separate PR?

2021-01-31 Thread Ying Zhou
Hi, As a part of the process of reducing test size in this pull request, https://github.com/apache/arrow/pull/8648, which contains the ORC writer for C++ and Python, I wrote a random chunked array generator and a random table generator. To reduce test s

Re: [RUST] Arrow guide

2021-01-31 Thread Wes McKinney
To state the obvious, it would be great to have some community-maintained documentation (beyond generated API docs) for the Rust library. Writing documentation almost always causes the quality of a code base to improve because the process brings up rough edges, inconsistencies, or missing features.

Re: [RUST] Arrow guide

2021-01-31 Thread Benjamin Blodgett
This is great, thanks for this! On Sun, Jan 31, 2021 at 9:25 AM Fernando Herrera < fernando.j.herr...@gmail.com> wrote: > Hi all, > > During the past months I have been trying to read and understand the code > base for the Rust implementation of Arrow. At the beginning I was just > reading the co

[RUST] Arrow guide

2021-01-31 Thread Fernando Herrera
Hi all, During the past months I have been trying to read and understand the code base for the Rust implementation of Arrow. At the beginning I was just reading the code and figuring out what each part or module was used for. Unfortunately, this approach didn't work very well and I had to start from

[NIGHTLY] Arrow Build Report for Job nightly-2021-01-31-0

2021-01-31 Thread Crossbow
Arrow Build Report for Job nightly-2021-01-31-0 All tasks: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-01-31-0 Failed Tasks: - conda-linux-gcc-py36-aarch64: URL: https://github.com/ursacomputing/crossbow/branches/all?query=nightly-2021-01-31-0-drone-conda-linux