Re: [C++] Arrow added to OSS-Fuzz

2020-01-16 Thread Marco Neumann
irectly to fuzzit and OSS-fuzz to not be bombarded by error messages.  Cheers,  Marco Neumann  Jan 16, 2020, 02:23 by liya.fa...@gmail.com: > Hi Antoine, > > Good job! And thanks for sharing the great news! > > Best, > Liya Fan > > On Thu, Jan 16, 2020 at 2:59 AM Antoine Pit

[jira] [Created] (ARROW-6872) [C++][Python] Empty table with dictionary-columns raises ArrowNotImplementedError

2019-10-14 Thread Marco Neumann (Jira)
Marco Neumann created ARROW-6872: Summary: [C++][Python] Empty table with dictionary-columns raises ArrowNotImplementedError Key: ARROW-6872 URL: https://issues.apache.org/jira/browse/ARROW-6872

[jira] [Created] (ARROW-6424) [C++][Fuzzing] Fuzzit nightly is broken

2019-09-03 Thread Marco Neumann (Jira)
Marco Neumann created ARROW-6424: Summary: [C++][Fuzzing] Fuzzit nightly is broken Key: ARROW-6424 URL: https://issues.apache.org/jira/browse/ARROW-6424 Project: Apache Arrow Issue Type: Bug

[jira] [Created] (ARROW-6273) [C++][Fuzzing] Add fuzzer for parquet->arrow read path

2019-08-16 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-6273: Summary: [C++][Fuzzing] Add fuzzer for parquet->arrow read path Key: ARROW-6273 URL: https://issues.apache.org/jira/browse/ARROW-6273 Project: Apache Ar

[jira] [Created] (ARROW-6270) [C++][Fuzzing] IPC reads do not check buffer indices

2019-08-16 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-6270: Summary: [C++][Fuzzing] IPC reads do not check buffer indices Key: ARROW-6270 URL: https://issues.apache.org/jira/browse/ARROW-6270 Project: Apache Arrow

[jira] [Created] (ARROW-6269) [C++][Fuzzing] IPC reads do not check decimal precision

2019-08-16 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-6269: Summary: [C++][Fuzzing] IPC reads do not check decimal precision Key: ARROW-6269 URL: https://issues.apache.org/jira/browse/ARROW-6269 Project: Apache Arrow

[jira] [Created] (ARROW-5990) RowGroupMetaData.column misses bounds check

2019-07-19 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5990: Summary: RowGroupMetaData.column misses bounds check Key: ARROW-5990 URL: https://issues.apache.org/jira/browse/ARROW-5990 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-5987) [C++][Fuzzing] arrow-ipc-fuzzing-test crash 3c3f1b74f347ec6c8b0905e7126b9074b9dc5564

2019-07-19 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5987: Summary: [C++][Fuzzing] arrow-ipc-fuzzing-test crash 3c3f1b74f347ec6c8b0905e7126b9074b9dc5564 Key: ARROW-5987 URL: https://issues.apache.org/jira/browse/ARROW-5987

[jira] [Created] (ARROW-5959) [C++][CI] Fuzzit does not know about branch + commit hash

2019-07-16 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5959: Summary: [C++][CI] Fuzzit does not know about branch + commit hash Key: ARROW-5959 URL: https://issues.apache.org/jira/browse/ARROW-5959 Project: Apache Arrow

[jira] [Created] (ARROW-5921) [C++][Fuzzing] Missing nullptr checks in IPC

2019-07-12 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5921: Summary: [C++][Fuzzing] Missing nullptr checks in IPC Key: ARROW-5921 URL: https://issues.apache.org/jira/browse/ARROW-5921 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-5607) [C++][Fuzzing] arrow-ipc-fuzzing-test crash 607e9caa76863a97f2694a769a1ae2fb83c55e02

2019-06-14 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5607: Summary: [C++][Fuzzing] arrow-ipc-fuzzing-test crash 607e9caa76863a97f2694a769a1ae2fb83c55e02 Key: ARROW-5607 URL: https://issues.apache.org/jira/browse/ARROW-5607

[jira] [Created] (ARROW-5605) [C++][Fuzzing] arrow-ipc-fuzzing-test crash 74aec871d14bb6b07c72ea8f0e8c9f72cbe6b73c

2019-06-14 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5605: Summary: [C++][Fuzzing] arrow-ipc-fuzzing-test crash 74aec871d14bb6b07c72ea8f0e8c9f72cbe6b73c Key: ARROW-5605 URL: https://issues.apache.org/jira/browse/ARROW-5605

[jira] [Created] (ARROW-5593) [C++][Fuzzing] Test fuzzers against arrow-testing corpus

2019-06-13 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5593: Summary: [C++][Fuzzing] Test fuzzers against arrow-testing corpus Key: ARROW-5593 URL: https://issues.apache.org/jira/browse/ARROW-5593 Project: Apache Arrow

[jira] [Created] (ARROW-5589) ipc-fuzzing-test crash 2354085db0125113f04f7bd23f54b85cca104713

2019-06-13 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5589: Summary: ipc-fuzzing-test crash 2354085db0125113f04f7bd23f54b85cca104713 Key: ARROW-5589 URL: https://issues.apache.org/jira/browse/ARROW-5589 Project: Apache Arrow

[jira] [Created] (ARROW-5525) Enable continuous fuzzing

2019-06-07 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5525: Summary: Enable continuous fuzzing Key: ARROW-5525 URL: https://issues.apache.org/jira/browse/ARROW-5525 Project: Apache Arrow Issue Type: Test

[jira] [Created] (ARROW-5166) [Python] Statistics for uint64 columns may overflow

2019-04-12 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5166: Summary: [Python] Statistics for uint64 columns may overflow Key: ARROW-5166 URL: https://issues.apache.org/jira/browse/ARROW-5166 Project: Apache Arrow

[jira] [Created] (ARROW-5028) Arrow->Parquet store drops and corrupts values

2019-03-27 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-5028: Summary: Arrow->Parquet store drops and corrupts values Key: ARROW-5028 URL: https://issues.apache.org/jira/browse/ARROW-5028 Project: Apache Arrow Is

Re: [Rust] crate versions and release process

2019-01-06 Thread Marco Neumann
The main question is: do we want independent releases (for security and severe bugs) or not? If so, what about the following versioning scheme: Major.Minor.Fix Major and minor are in-sync for all languages and libraries/crates and can be featured in blog posts and other channels. Fix relea
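As a rough illustration of the Major.Minor.Fix idea in the snippet above, the following sketch models a version triple and checks whether two crates are "in sync" on major and minor. The type `Version` and the method `in_sync_with` are hypothetical names introduced here, and the assumption that only the fix component may differ between crates is a reading of the (truncated) proposal, not something the thread confirms.

```rust
/// Hypothetical Major.Minor.Fix version triple (illustration only).
/// Field order matters: the derived ordering compares major, then minor, then fix.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u32,
    pub minor: u32,
    pub fix: u32,
}

impl Version {
    /// Parse a string of the form "MAJOR.MINOR.FIX"; returns None for anything else.
    pub fn parse(s: &str) -> Option<Version> {
        let mut parts = s.split('.');
        let major = parts.next()?.parse().ok()?;
        let minor = parts.next()?.parse().ok()?;
        let fix = parts.next()?.parse().ok()?;
        // Reject trailing components such as "1.2.3.4".
        if parts.next().is_some() {
            return None;
        }
        Some(Version { major, minor, fix })
    }

    /// Assumed rule: two crates are "in sync" when major and minor match,
    /// while the fix component is allowed to differ per crate.
    pub fn in_sync_with(&self, other: &Version) -> bool {
        self.major == other.major && self.minor == other.minor
    }
}
```

Under this reading, a crate at 0.12.1 and another at 0.12.0 would count as the same coordinated release, while 0.13.0 would not.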

Re: [Rust] crate versions and release process

2019-01-04 Thread Marco Neumann
+1 The only thing to keep in mind is that versions are a statement regarding API stability (aka semantic versioning). It is easy to forget about these things in a monorepo since you can fix all the breaking changes in the PR in which they got introduced. So whoever cuts the release must account for that an

Re: [RUST] [DISCUSS] Changing type of array lengths

2018-12-07 Thread Marco Neumann
On Windows it depends on whether it's a 32- or 64-bit binary, like on every other system as well. usize is usually used by Rust containers for indexing (see for example Vec in the standard library) and I personally find it very annoying if libraries break that rule, because in Rust you have to be expli
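To make the point about `usize` indexing concrete, here is a minimal sketch of the conversions a caller has to write when a length or index is stored as a signed 64-bit integer instead. The helper names `get_signed` and `len_as_i64` are hypothetical and not taken from the Arrow Rust crate; only `usize::try_from`, `i64::try_from`, and slice `get` are standard-library calls.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: look up an element when the index arrives as an i64.
/// Slices and Vec are indexed by usize, so the conversion must be explicit
/// and can fail (negative values, or values too large for a 32-bit usize).
pub fn get_signed(values: &[i32], index: i64) -> Option<i32> {
    let idx = usize::try_from(index).ok()?;
    values.get(idx).copied()
}

/// Hypothetical helper: report a slice length as an i64.
/// The reverse conversion is also explicit and fallible in principle.
pub fn len_as_i64(values: &[i32]) -> Option<i64> {
    i64::try_from(values.len()).ok()
}
```

With a `usize` index neither conversion would be needed; `values.get(idx)` could be called directly.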

Re: [RUST] [DISCUSS] Changing type of array lengths

2018-12-07 Thread Marco Neumann
One question here is: do we want to support datasets with more than 4G entries on 32-bit systems? If so, how would this even be possible (since you cannot just fit that much data in any addressable memory chunk in Rust)? So I would say: usize is idiomatic and supports large enough datasets on the

Re: Rust nightly + formatting changes

2018-12-04 Thread Marco Neumann
rustfmt will hit 1.0 tomorrow (Rust 1.31 release) and ensures stable formatting over multiple releases. As far as I understood, this "formatting style stability" will affect all channels, so we could run rustfmt on the same channel(s) as our CI. Might be that we need a jira ticket to change from

Re: [VOTE] Accept donation of Rust Parquet implementation

2018-12-02 Thread Marco Neumann
+1 Happy to see this contribution :-) On December 1, 2018 12:50:49 AM GMT+01:00, Wes McKinney wrote: >Dear all, > >The developers of > >https://github.com/sunchao/parquet-rs > >have been in touch with Apache Arrow and Apache Parquet. Based on >mailing list discussions, it is being proposed to d

Re: [VOTE] Release Apache Arrow 0.10.0 (RC0)

2018-08-03 Thread Marco Neumann
the next RC (RC1) if the code >described in https://issues.apache.org/jira/browse/ARROW-2963 works >properly after the fix. Please let us know tomorrow if you run into >any more issues. Thank you! > >On Thu, Aug 2, 2018 at 3:38 PM, Marco Neumann > wrote: >> I'll test the

Re: [VOTE] Release Apache Arrow 0.10.0 (RC0)

2018-08-02 Thread Marco Neumann
I'll test the PR tomorrow (Friday, until 15:00 UTC). Thanks for the quick fix! @Wes Might be doable, I'll check how we can improve there. Sorry for catching this problem that late. I'm totally fine with the "no veto" policy. It's a bug for which no test existed beforehand, and a behavior / fea

Re: [VOTE] Release Apache Arrow 0.10.0 (RC0)

2018-08-02 Thread Marco Neumann
+0 Because of https://issues.apache.org/jira/browse/ARROW-2963 (not sure if this counts as a blocker) Marco On August 1, 2018 7:35:50 PM GMT+02:00, Phillip Cloud wrote: >Hello all, > >I'd like to propose the 1st release candidate (rc0) of Apache Arrow >version >0.10.0. This is a major release

[jira] [Created] (ARROW-2963) [Python] Deadlock during fork-join and use_threads=True

2018-08-02 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-2963: Summary: [Python] Deadlock during fork-join and use_threads=True Key: ARROW-2963 URL: https://issues.apache.org/jira/browse/ARROW-2963 Project: Apache Arrow

Re: Recruiting more maintainers for Apache Arrow

2018-06-30 Thread Marco Neumann
Hey, first of all, thanks a lot for your, Uwe's, the mergers' and contributors' work. Now, to the maintainer problem: # Arrow as "a library" One thing that makes Arrow special is that it is not a single, but many libraries (one for each language) and many of them are not only a binding to a C/C++ li

[jira] [Created] (ARROW-2554) pa.array type inference bug when using NS-timestamp

2018-05-08 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-2554: Summary: pa.array type inference bug when using NS-timestamp Key: ARROW-2554 URL: https://issues.apache.org/jira/browse/ARROW-2554 Project: Apache Arrow

[jira] [Created] (ARROW-2513) [Python] DictionaryType should give access to index type and dictionary array

2018-04-26 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-2513: Summary: [Python] DictionaryType should give access to index type and dictionary array Key: ARROW-2513 URL: https://issues.apache.org/jira/browse/ARROW-2513 Project

[jira] [Created] (ARROW-1589) Fuzzing for certain input formats

2017-09-21 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-1589: Summary: Fuzzing for certain input formats Key: ARROW-1589 URL: https://issues.apache.org/jira/browse/ARROW-1589 Project: Apache Arrow Issue Type: Test

[jira] [Created] (ARROW-1276) Cannot serializer empty DataFrame to parquet

2017-07-26 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-1276: Summary: Cannot serializer empty DataFrame to parquet Key: ARROW-1276 URL: https://issues.apache.org/jira/browse/ARROW-1276 Project: Apache Arrow Issue Type

[jira] [Created] (ARROW-1083) Object categoricals are not serialized when only None is present

2017-06-02 Thread Marco Neumann (JIRA)
Marco Neumann created ARROW-1083: Summary: Object categoricals are not serialized when only None is present Key: ARROW-1083 URL: https://issues.apache.org/jira/browse/ARROW-1083 Project: Apache Arrow