Re: Tungsten off heap memory access for C++ libraries

2016-04-28 Thread jpivar...@gmail.com
jpivar...@gmail.com wrote:
> P.S. Concerning Java/C++ bindings, there are many. I tried JNI, JNA, BridJ, and JavaCPP personally, but in the end picked JNA because of its (comparatively) large user base. If Spark will be using Djinni, that could be a symmetry-breaking consideration and I'll sta

Re: Tungsten off heap memory access for C++ libraries

2016-04-28 Thread jpivar...@gmail.com
Hi, I'm coming from the particle physics community and I'm also very interested in the development of this project. We have a huge C++ codebase and would like to start using the higher-level abstractions of Spark in our data analyses. To this end, I've been developing code that copies data from ou

Re: Tungsten off heap memory access for C++ libraries

2015-10-01 Thread Paul Wais
Update for those who are still interested: djinni is a nice tool for generating Java/C++ bindings. Before today djinni's Java support was only aimed at Android, but now djinni works with (at least) Debian, Ubuntu, and CentOS. djinni will help you run C++ code in-process with the caveat that djinn

Re: Tungsten off heap memory access for C++ libraries

2015-09-01 Thread Paul Weiss
https://issues.apache.org/jira/browse/SPARK-10399 is the JIRA to track.
On Sep 1, 2015 5:32 PM, "Paul Wais" wrote:
> Paul: I've worked on running C++ code on Spark at scale before (via JNA, ~200 cores) and am working on something more contribution-oriented now (via JNI).
> A few comments:

Re: Tungsten off heap memory access for C++ libraries

2015-09-01 Thread Paul Wais
Paul: I've worked on running C++ code on Spark at scale before (via JNA, ~200 cores) and am working on something more contribution-oriented now (via JNI). A few comments:
* If you need something *today*, try JNA. It can be slow (e.g. a short native function in a tight loop) but works if you have

Re: Tungsten off heap memory access for C++ libraries

2015-09-01 Thread Reynold Xin
Please do. Thanks.
On Mon, Aug 31, 2015 at 5:00 AM, Paul Weiss wrote:
> Sounds good, want me to create a jira and link it to SPARK-9697? Will put down some ideas to start.
> On Aug 31, 2015 4:14 AM, "Reynold Xin" wrote:
>> BTW if you are interested in this, we could definitely get some help

Re: Tungsten off heap memory access for C++ libraries

2015-08-31 Thread Paul Weiss
Sounds good. Want me to create a JIRA and link it to SPARK-9697? I will put down some ideas to start.
On Aug 31, 2015 4:14 AM, "Reynold Xin" wrote:
> BTW if you are interested in this, we could definitely get some help in terms of prototyping the feasibility, i.e. how we can have a native (e.g.

Re: Tungsten off heap memory access for C++ libraries

2015-08-31 Thread Reynold Xin
BTW if you are interested in this, we could definitely get some help in terms of prototyping the feasibility, i.e. how we can have a native (e.g. C++) API for data access shipped with Spark. There are a lot of questions (e.g. build, portability) that need to be answered.
On Mon, Aug 31, 2015 at 1:

Re: Tungsten off heap memory access for C++ libraries

2015-08-31 Thread Reynold Xin
On Sun, Aug 30, 2015 at 5:58 AM, Paul Weiss wrote:
> Also, is this work being done on a branch I could look into further and try out?
We don't have a branch yet, because there is neither code nor a design for this yet. As I said, it is one of the motivations behind Tungsten, but it is fairly e

Re: Tungsten off heap memory access for C++ libraries

2015-08-30 Thread Paul Weiss
Reynold, That is great to hear. Definitely interested in how 2. is being implemented and how it will be exposed in C++. One important aspect of leveraging the off heap memory is how the data is organized as well as being able to easily access it from the C++ side. For example how would you stor

Re: Tungsten off heap memory access for C++ libraries

2015-08-29 Thread Reynold Xin
Supporting non-JVM code without memory copying and serialization is actually one of the motivations behind Tungsten. We didn't talk much about it since it is not end-user-facing and it is still too early. There are a few challenges still:
1. Spark cannot run entirely in off-heap mode (by entirely

Re: Tungsten off heap memory access for C++ libraries

2015-08-29 Thread Timothy Chen
I would also like to see data shared off-heap with a 3rd-party C++ library through JNI. I think the complications would be how to manage the memory and how to make sure the 3rd-party libraries adhere to the access contracts.
Tim
On Sat, Aug 29, 2015 at 12:17 PM, Paul Weiss wrote:
> Hi,
>
> Wou