[jira] [Created] (ARROW-2449) [Python] Efficiently serialize functions containing NumPy arrays

2018-04-11 Thread Richard Shin (JIRA)
Richard Shin created ARROW-2449: --- Summary: [Python] Efficiently serialize functions containing NumPy arrays Key: ARROW-2449 URL: https://issues.apache.org/jira/browse/ARROW-2449 Project: Apache Arrow

Re: [DRAFT] Apache Arrow board report April 2018

2018-04-11 Thread Jacques Nadeau
Thanks Wes and Uwe. I added a couple more activity items around the build work and the intention to make faster Rust releases. Initial posted report below. Let me know if you think we should wordsmith further. ## Description: Apache Arrow is a cross-language development platform for in-memory data. It

RE: Correct way to set NULL values in VarCharVector (Java API)?

2018-04-11 Thread Atul Dambalkar
Hi Sid, Emilio, Need some more help. Here is how I am using the NullableVarCharHolder - -- String value = "some text string"; NullableVarCharHolder holder = new NullableVarCharHolder(); holder.isSet = 1; byte[] bytes = value.getBytes(StandardCha

[jira] [Created] (ARROW-2448) Segfault when plasma client goes out of scope before buffer.

2018-04-11 Thread Robert Nishihara (JIRA)
Robert Nishihara created ARROW-2448: --- Summary: Segfault when plasma client goes out of scope before buffer. Key: ARROW-2448 URL: https://issues.apache.org/jira/browse/ARROW-2448 Project: Apache Arro

RE: Correct way to set NULL values in VarCharVector (Java API)?

2018-04-11 Thread Atul Dambalkar
Thanks Sid and Emilio. I think this can be extended to pretty much all the SQL and corresponding Arrow data types. -Atul -Original Message- From: Siddharth Teotia [mailto:siddha...@dremio.com] Sent: Wednesday, April 11, 2018 10:27 AM To: dev@arrow.apache.org Subject: Re: Correct way t

Re: Correct way to set NULL values in VarCharVector (Java API)?

2018-04-11 Thread Siddharth Teotia
Another option is to use the set() API that allows you to indicate whether the value is NULL or not using an isSet parameter (0 for NULL, 1 otherwise). This is similar to holder based APIs where you need to indicate in holder.isSet whether value is NULL or not. https://github.com/apache/arrow/blob
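Editorial sketch of the isSet convention described above (0 for NULL, 1 otherwise). This is a toy Python model, not the Arrow Java API: the class `ToyVarCharColumn` and its methods are hypothetical names chosen for illustration, and the real `VarCharVector.set()` signature should be taken from the linked source.

```python
class ToyVarCharColumn:
    """Toy stand-in for a nullable string vector (hypothetical, not Arrow)."""

    def __init__(self, capacity):
        # One validity flag per slot: 0 means NULL, 1 means a value is present.
        self.validity = [0] * capacity
        self.values = [b""] * capacity

    def set(self, index, is_set, data=b""):
        # The caller states explicitly whether the slot holds a value.
        self.validity[index] = 1 if is_set else 0
        self.values[index] = data if is_set else b""

    def get(self, index):
        # Readers consult the validity flag before touching the bytes.
        if not self.validity[index]:
            return None
        return self.values[index].decode("utf-8")


column = ToyVarCharColumn(2)
column.set(0, 1, "some text string".encode("utf-8"))
column.set(1, 0)  # slot 1 is NULL
```

The point of the convention is that NULL is recorded in the validity flag rather than by a sentinel value in the data itself.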

Re: [DRAFT] Apache Arrow board report April 2018

2018-04-11 Thread Uwe L. Korn
We should add the start of a Rust implementation to Activity, otherwise this looks good. Uwe On Wed, Apr 11, 2018, at 7:14 PM, Wes McKinney wrote: > ## Description: > > Apache Arrow is a cross-language development platform for in-memory data. It > specifies a standardized language-independent c

[DRAFT] Apache Arrow board report April 2018

2018-04-11 Thread Wes McKinney
## Description: Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries

Board report

2018-04-11 Thread Jacques Nadeau
Hey All, I'm wrapped up at a conference today and will pull together a draft tonight. If anyone wants to make a start, it would be much appreciated. Thanks

Continuous benchmarking setup

2018-04-11 Thread Antoine Pitrou
Hello, With the following changes, it seems we might reach the point where we're able to run the Python-based benchmark suite across multiple commits (at least the ones not anterior to those changes): https://github.com/apache/arrow/pull/1775 To make this truly useful, we would need a dedicated

Re: What do people think about a one day get together?

2018-04-11 Thread Sourav Mazumder
+1 I would love to attend too. I will be at Spark Summit and presenting there as well. Regards, Sourav Mazumder Data Science Center of Competency IBM Analytics On Mon, Apr 9, 2018 at 10:23 AM, Julian Hyde wrote: > +1 The Arrow community would benefit greatly from a > conference/unconferenc

Re: Correct way to set NULL values in VarCharVector (Java API)?

2018-04-11 Thread Emilio Lahr-Vivaz
Hi Atul, You should be able to use the overloaded 'set' method that takes a NullableVarCharHolder: https://github.com/apache/arrow/blob/master/java/vector/src/main/java/org/apache/arrow/vector/VarCharVector.java#L237 Thanks, Emilio On 04/10/2018 05:23 PM, Atul Dambalkar wrote: Hi, I wante

Re: Buffer slices are unsafe

2018-04-11 Thread Antoine Pitrou
Hi Dimitri, On 11/04/2018 at 13:42, Dimitri Vorona wrote: > > I was thinking about something like this [0]. The point is that the slice > user has no way of knowing if the slice can still be safely used and who > owns the memory. I think the answer is that calling free() on something you exp

Re: Buffer slices are unsafe

2018-04-11 Thread Dimitri Vorona
Hi Antoine, I was thinking about something like this [0]. The point is that the slice user has no way of knowing if the slice can still be safely used and who owns the memory. You are right, of course, that it wouldn't be thread safe and we'd need a locking mechanism which prevents de-/reallocat

[jira] [Created] (ARROW-2447) [C++] Create a device abstraction

2018-04-11 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2447: - Summary: [C++] Create a device abstraction Key: ARROW-2447 URL: https://issues.apache.org/jira/browse/ARROW-2447 Project: Apache Arrow Issue Type: Improvem

Re: Buffer slices are unsafe

2018-04-11 Thread Antoine Pitrou
Hi Dimitri, On 11/04/2018 at 12:28, Dimitri Vorona wrote: > > I think it comes down to the memory ownership. While Buffer apparently > never owns its memory (based on the doc string), a MutableBuffer can. So > if you slice a MutableBuffer, and the memory gets deallocated, you've got > the sa

Re: Buffer slices are unsafe

2018-04-11 Thread Dimitri Vorona
Hi Antoine, > AFAIU, the problem only exists with ResizableBuffer? I think it comes down to the memory ownership. While Buffer apparently never owns its memory (based on the doc string), a MutableBuffer can. So if you slice a MutableBuffer, and the memory gets deallocated, you've got the same p

[jira] [Created] (ARROW-2446) [C++] SliceBuffer on CudaBuffer should return CudaBuffer

2018-04-11 Thread Antoine Pitrou (JIRA)
Antoine Pitrou created ARROW-2446: - Summary: [C++] SliceBuffer on CudaBuffer should return CudaBuffer Key: ARROW-2446 URL: https://issues.apache.org/jira/browse/ARROW-2446 Project: Apache Arrow

Re: Build system discussion for Arrow (and Orc?)

2018-04-11 Thread Uwe L. Korn
Hello Michael, > 1. Is this a welcome change, or should we just carry patches locally? These changes would be very welcome. The current vendoring approach exists for all dependencies mostly to have a smooth development experience. It is not meant for releases. The current approach for ORC i

Re: Buffer slices are unsafe

2018-04-11 Thread Antoine Pitrou
Hi Dimitri, On 11/04/2018 at 09:02, Dimitri Vorona wrote: > Hi everybody, > > to continue the discussion in [0]: right now this [1] can happen and the > sliced buffer has no way to foresee or to check against it beforehand. > > I'd suggest creating a new class SlicedBuffer, which would refer

Buffer slices are unsafe

2018-04-11 Thread Dimitri Vorona
Hi everybody, to continue the discussion in [0]: right now this [1] can happen and the sliced buffer has no way to foresee or to check against it beforehand. I'd suggest creating a new class SlicedBuffer, which would reference the parent buffer and return its data() pointer, instead of grabbing
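Editorial sketch of the SlicedBuffer idea proposed above: a slice that keeps a reference to its parent and looks up the parent's data on every access, so it can check whether it is still covered. This is a toy Python model under stated assumptions, not the Arrow C++ classes: `ToyBuffer`, `resize`, and `is_valid` are hypothetical names, reallocation is modelled by swapping in a new `bytearray`, and thread safety (raised elsewhere in the thread) is not addressed.

```python
class ToyBuffer:
    """Hypothetical resizable buffer; storage may be replaced on resize."""

    def __init__(self, payload):
        self._data = bytearray(payload)

    def data(self):
        return self._data

    def resize(self, new_size):
        # Model a reallocation: allocate new storage and copy what fits.
        new_storage = bytearray(new_size)
        kept = min(new_size, len(self._data))
        new_storage[:kept] = self._data[:kept]
        self._data = new_storage


class SlicedBuffer:
    """Slice that refers to its parent instead of holding a raw pointer."""

    def __init__(self, parent, offset, length):
        self.parent = parent
        self.offset = offset
        self.length = length

    def is_valid(self):
        # The slice is usable only while the parent still covers its range.
        return self.offset + self.length <= len(self.parent.data())

    def data(self):
        if not self.is_valid():
            raise ValueError("parent buffer no longer covers this slice")
        start = self.offset
        return bytes(self.parent.data()[start:start + self.length])


parent = ToyBuffer(b"hello world")
view = SlicedBuffer(parent, 6, 5)
before = view.data()      # read through the parent's current storage
parent.resize(4)          # parent shrinks; the slice can now detect it
usable = view.is_valid()
```

The trade-off in this model is an extra indirection and a bounds check on each access, in exchange for the slice being able to detect that its parent changed.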