[jira] [Created] (ARROW-4819) [Java] Make JniWrapper native method be public

2019-03-10 Thread weijie.tong (JIRA)
weijie.tong created ARROW-4819: -- Summary: [Java] Make JniWrapper native method be public Key: ARROW-4819 URL: https://issues.apache.org/jira/browse/ARROW-4819 Project: Apache Arrow Issue Type: I

Read Arrow 0.9.0 output using newer pyarrow version

2019-03-10 Thread Rares Vernica
Hello, I have a C++ library using Arrow 0.9.0 to serialize data. The code looks like this: std::shared_ptr arrowBatch; arrowBatch = arrow::RecordBatch::Make(_arrowSchema, nCells, _arrowArrays); std::shared_ptr arrowBuffer(new arrow::PoolBuffer(_arrowPool)); arrow::io::BufferOutputStream arrowStre

Re: OversizedAllocationException for pandas_udf in pyspark

2019-03-10 Thread Abdeali Kothari
Hi, any help on this would be much appreciated. I've not been able to figure out any reason for this to happen yet. On Sat, Mar 2, 2019, 11:50 Abdeali Kothari wrote: > Hi Li Jin, thanks for the note. > > I get this error only for larger data - when I reduce the number of > records or the number o

Re: [C++] Failing constructors and internal state

2019-03-10 Thread Wes McKinney
hi Edmon, Here's an example of a function that does some schema validation: https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.cc#L450 The issue is less about the magnitude of the cost and more of a software engineering question about layering of concerns. Consider two code paths:

Re: [C++] Failing constructors and internal state

2019-03-10 Thread Edmon Begoli
Do you guys have an example somewhere of this validated vs. unvalidated code, and suspected performance impacts, and has anyone benchmarked any of this? On Sun, Mar 10, 2019 at 5:45 PM Wes McKinney wrote: > I think having consistent methods for both validated and unvalidated > construction is

Re: [C++] Failing constructors and internal state

2019-03-10 Thread Wes McKinney
I think having consistent methods for both validated and unvalidated construction is a good idea. Being fairly passionate about microperformance, I don't think we should penalize responsible users of unsafe/unvalidated APIs (e.g. by taking them away and replacing them with variants featuring unavoi

Re: [C++] Failing constructors and internal state

2019-03-10 Thread Micah Kornfield
I agree there should always be a path to avoid the validation, but I think there should also be an easy way to have validation included and a clear way to tell the difference. IMO, having a strong naming convention so callers can tell the difference, and code reviewers can focus more on less safe met

[jira] [Created] (ARROW-4818) [Rust] [Parquet] Parquet reader does not support null values

2019-03-10 Thread Andy Grove (JIRA)
Andy Grove created ARROW-4818: - Summary: [Rust] [Parquet] Parquet reader does not support null values Key: ARROW-4818 URL: https://issues.apache.org/jira/browse/ARROW-4818 Project: Apache Arrow

[jira] [Created] (ARROW-4817) [Rust] [DataFusion] Small re-org of modules

2019-03-10 Thread Andy Grove (JIRA)
Andy Grove created ARROW-4817: - Summary: [Rust] [DataFusion] Small re-org of modules Key: ARROW-4817 URL: https://issues.apache.org/jira/browse/ARROW-4817 Project: Apache Arrow Issue Type: Improv

[jira] [Created] (ARROW-4816) [Rust] [DataFusion] Add support for repartitioning

2019-03-10 Thread Andy Grove (JIRA)
Andy Grove created ARROW-4816: - Summary: [Rust] [DataFusion] Add support for repartitioning Key: ARROW-4816 URL: https://issues.apache.org/jira/browse/ARROW-4816 Project: Apache Arrow Issue Type

Re: [C++] Failing constructors and internal state

2019-03-10 Thread Wes McKinney
hi folks, I think some issues are being conflated here, so let me try to dig through them. Let's first look at the two cited bugs that were fixed, if I have this right: * ARROW-4766: root cause dereferencing a null pointer * ARROW-4774: root cause unsanitized Python user input None of the 4 reme

[jira] [Created] (ARROW-4815) [Rust] [DataFusion] Add support for * in SQL projection

2019-03-10 Thread Andy Grove (JIRA)
Andy Grove created ARROW-4815: - Summary: [Rust] [DataFusion] Add support for * in SQL projection Key: ARROW-4815 URL: https://issues.apache.org/jira/browse/ARROW-4815 Project: Apache Arrow Issue

Re: Assignee on Jira

2019-03-10 Thread paddy horan
Thanks Kou, appreciate it. P From: Kouhei Sutou Sent: Sunday, March 10, 2019 12:59 AM To: dev@arrow.apache.org Subject: Re: Assignee on Jira Hi, Yes. We need to add the user to the "contributor" role in JIRA to assign to the user. Adding a user to the "contrib

[jira] [Created] (ARROW-4814) [Python] Exception when writing nested columns that are tuples to parquet

2019-03-10 Thread Suvayu Ali (JIRA)
Suvayu Ali created ARROW-4814: - Summary: [Python] Exception when writing nested columns that are tuples to parquet Key: ARROW-4814 URL: https://issues.apache.org/jira/browse/ARROW-4814 Project: Apache Arr