Re: [Discuss] [Java] DateMilliVector.getObject() return type (LocalDateTime vs LocalDate)

2019-09-16 Thread Micah Kornfield
Anyone have an opinion on this? Personally, I'm leaning toward preserving compatibility with the existing API, but I don't feel too strongly about it. On Mon, Sep 9, 2019 at 7:39 PM Micah Kornfield wrote: > Yongbo Zhang, > Opened up a pull request to have DateMilliVector return a LocalDate > instead of a L
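For readers skimming this thread, here is a minimal sketch of how the two candidate return types named in the subject line differ for the same stored value. It uses only java.time and does not call any Arrow API. The class and method names (DateMilliSketch, asLocalDateTime, asLocalDate) are hypothetical, and the sketch assumes the stored long is milliseconds since the UNIX epoch interpreted in UTC.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper, not part of Arrow: contrasts the two return types.
class DateMilliSketch {
    private static final long MILLIS_PER_DAY = 86_400_000L;

    // Keeps the time-of-day component (assumes the value is in UTC).
    static LocalDateTime asLocalDateTime(long epochMillis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
    }

    // Keeps only the calendar day; floorDiv handles pre-1970 values correctly.
    static LocalDate asLocalDate(long epochMillis) {
        return LocalDate.ofEpochDay(Math.floorDiv(epochMillis, MILLIS_PER_DAY));
    }

    public static void main(String[] args) {
        // 2019-09-16T00:00:00Z plus one hour
        long millis = 1_568_592_000_000L + 3_600_000L;
        System.out.println(asLocalDateTime(millis));
        System.out.println(asLocalDate(millis));
    }
}
```

The LocalDate version discards any sub-day remainder, which is why switching the return type changes what existing callers receive.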

Re: [DISCUSS][Java] Design of the algorithm module

2019-09-16 Thread Micah Kornfield
Hi Liya Fan, Thank you for this writeup. It doesn't look like comments are enabled on the document. Could you allow for them? Thanks, Micah On Sat, Sep 14, 2019 at 6:57 AM Fan Liya wrote: > Dear all, > > We have prepared a document for discussing the requirements, design and > implementation i

Re: [DISCUSS] Improving Arrow columnar implementation guidelines for third parties

2019-09-16 Thread Micah Kornfield
1. Are there particular issues that have cropped up that we should be aware of? This might help inform how we go about this. 2. We should be publishing a matrix of current compliance with the standard for our existing implementations (this could be the basis of letting bespoke implementations cl

Re: [DISCUSS][C++] Rethinking our current C++ shared library (.so / .dll) approach

2019-09-16 Thread Micah Kornfield
I don't have a strong opinion here, but had a question and comment: Are there any implications from a project governance perspective of packaging Parquet and Arrow into a single shared library? As a comment, I'm a big +1 on trying to tease apart the circular dependencies between Parquet/Arrow

[jira] [Created] (ARROW-6576) [R] Fix sparklyr integration tests

2019-09-16 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-6576: Summary: [R] Fix sparklyr integration tests Key: ARROW-6576 URL: https://issues.apache.org/jira/browse/ARROW-6576 Project: Apache Arrow Issue Type: Bug

Re: [DISCUSS][C++] Rethinking our current C++ shared library (.so / .dll) approach

2019-09-16 Thread Sutou Kouhei
Hi, If this is circular, it's a problem. But this isn't circular for now. I think that we can use libarrow as the fundamental shared library to provide a common implementation like [1] if we need to provide a common implementation for templates. (I think that we don't provide common implementation for

Re: [DISCUSS][C++] Rethinking our current C++ shared library (.so / .dll) approach

2019-09-16 Thread Sutou Kouhei
Hi, I understand what problems we want to solve, especially the template and DLL problem in ARROW-6244. I feel that one shared library is overkill because we have many namespaces. If we had only the arrow:: namespace, it would be reasonable. But we have arrow::, gandiva::, parquet:: and plasma:: namespaces. It's a bit

[jira] [Created] (ARROW-6575) [JS] decimal toString does not support negative values

2019-09-16 Thread Andong Zhan (Jira)
Andong Zhan created ARROW-6575: Summary: [JS] decimal toString does not support negative values Key: ARROW-6575 URL: https://issues.apache.org/jira/browse/ARROW-6575 Project: Apache Arrow Issue

[Rust] DataFusion parallel query execution update

2019-09-16 Thread Andy Grove
I wanted to give a quick update to add some context to the work I am doing to add parallel query execution to DataFusion since I have been working on this largely in isolation. The current query execution code in DataFusion 0.14 is single-threaded and can only run against a single CSV or Parquet f

[jira] [Created] (ARROW-6574) [JS] TypeError with utf8 and JSONVectorLoader.readData

2019-09-16 Thread Adam M Krebs (Jira)
Adam M Krebs created ARROW-6574: Summary: [JS] TypeError with utf8 and JSONVectorLoader.readData Key: ARROW-6574 URL: https://issues.apache.org/jira/browse/ARROW-6574 Project: Apache Arrow Iss

[jira] [Created] (ARROW-6573) Segfault when writing to parquet

2019-09-16 Thread Josh Weinstock (Jira)
Josh Weinstock created ARROW-6573: Summary: Segfault when writing to parquet Key: ARROW-6573 URL: https://issues.apache.org/jira/browse/ARROW-6573 Project: Apache Arrow Issue Type: Bug

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-09-16-0

2019-09-16 Thread Krisztián Szűcs
That is a crossbow failure related to the artefact uploading; it could be a temporary network issue. I restarted the build. If it occurs again, I'll investigate whether we can increase its timeout. On Mon, Sep 16, 2019 at 5:12 PM Wes McKinney wrote: > Weird that the ubuntu-cosmic build failed ag

[jira] [Created] (ARROW-6572) [C++] Reading some Parquet data can return uninitialized memory

2019-09-16 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-6572: Summary: [C++] Reading some Parquet data can return uninitialized memory Key: ARROW-6572 URL: https://issues.apache.org/jira/browse/ARROW-6572 Project: Apache Arrow

Re: [NIGHTLY] Arrow Build Report for Job nightly-2019-09-16-0

2019-09-16 Thread Wes McKinney
Weird that the ubuntu-cosmic build failed again. Is something possibly wrong with that build? https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=1073 On Mon, Sep 16, 2019 at 7:01 AM Crossbow wrote: > > > Arrow Build Report for Job nightly-2019-09-16-0 > > All tasks: > https://githu

[DISCUSS] Improving Arrow columnar implementation guidelines for third parties

2019-09-16 Thread Wes McKinney
hi folks, As Apache Arrow grows more popular, we may acquire some different kinds of third party developers: A. Developers who use and, in many cases, contribute to one of the project's reference implementations B. Developers who choose to implement the columnar format themselves, without depend

[jira] [Created] (ARROW-6571) [Developer] Provide means to "plug in" a third party Arrow implementation into the integration test suite for validation purposes

2019-09-16 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6571: Summary: [Developer] Provide means to "plug in" a third party Arrow implementation into the integration test suite for validation purposes Key: ARROW-6571 URL: https://issues.apache

[jira] [Created] (ARROW-6570) [Python] Use MemoryPool to allocate memory for NumPy arrays in to_pandas calls

2019-09-16 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-6570: Summary: [Python] Use MemoryPool to allocate memory for NumPy arrays in to_pandas calls Key: ARROW-6570 URL: https://issues.apache.org/jira/browse/ARROW-6570 Project: A

[NIGHTLY] Arrow Build Report for Job nightly-2019-09-16-0

2019-09-16 Thread Crossbow
Arrow Build Report for Job nightly-2019-09-16-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-16-0 Failed Tasks: - ubuntu-cosmic: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2019-09-16-0-azure-ubuntu-cosmic - docker-spark-integ