Re: Apache Arrow at JupyterCon

Wes McKinney Sun, 03 Sep 2017 07:14:21 -0700

I think ideally we would have a Java interface that would support all of:

- Memory-mapped files
- Anonymous shared memory segments (e.g. POSIX shm)
- NVM / Mnemonic


We already have the ability to do zero-copy reads from buffer-like
objects in C++, along with IO interfaces that support zero copy (like
memory-mapped files). We can do zero-copy reads from ArrowBuf in Java, but
we are missing the interfaces to shared memory sources.

- Wes
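
A minimal sketch of the memory-mapped case described above, using only the
JDK's FileChannel.map and not any Arrow API. The class MappedReadSketch and
its helpers (mapReadOnly, writeInts, sum) are hypothetical names for
illustration; the little-endian int32 layout is an assumption standing in for
a fixed-width data buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Arrow or Mnemonic.
class MappedReadSketch {

    // Map the whole file read-only. The returned buffer is a view over the
    // mapped pages, so reading from it does not copy the file onto the heap.
    public static MappedByteBuffer mapReadOnly(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Write the values as little-endian int32s to a temporary file.
    public static Path writeInts(int... values) throws IOException {
        Path path = Files.createTempFile("mapped-sketch", ".bin");
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (int v : values) {
            buf.putInt(v);
        }
        Files.write(path, buf.array());
        return path;
    }

    // Sum the int32 values through a duplicate view, leaving the caller's
    // position and limit untouched.
    public static long sum(ByteBuffer data) {
        ByteBuffer view = data.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        long total = 0;
        while (view.remaining() >= Integer.BYTES) {
            total += view.getInt();
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path path = writeInts(1, 2, 3, 4);
        MappedByteBuffer mapped = mapReadOnly(path);
        System.out.println(sum(mapped)); // prints 10
    }
}
```

The same read path would need a different source for anonymous shared memory
or NVM, which is the gap the message above points at; this sketch covers only
the file-backed case.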

On Thu, Aug 31, 2017 at 5:09 PM, Gang(Gary) Wang <ga...@apache.org> wrote:
> Hi Wes,
>
> Thank you for the explanation. The use case in
> https://issues.apache.org/jira/browse/ARROW-721 could be supported directly
> by Mnemonic through DurableBuffer and DurableChunk. DurableChunk makes
> use of unsafe to expose a plain memory space for Arrow to use without
> performance penalties, which is why most big data frameworks take
> advantage of unsafe. Please refer to
> https://mnemonic.apache.org/docs/domusecases.html for the use cases. We
> could work on this ticket if you think that is exactly what you want.
>
> Regarding NVM technology, that is what Mnemonic was created for. It can be
> used to persist generic Java objects and collections directly on NVM with no
> SerDe. Which basic tools did you have in mind? We can probably also help
> identify the gaps in Mnemonic. Thanks!
>
> Very truly yours,
> Gary
>
> On Thu, Aug 31, 2017 at 12:32 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>
>> hi Gary,
>>
>> The Java libraries cannot yet write Arrow datasets to shared memory or
>> memory-mapped files, or do zero-copy reads from them:
>> https://issues.apache.org/jira/browse/ARROW-721. We've developed quite
>> a bit of technology on the C++ side for dealing with shared memory IPC,
>> but we need someone to help with that on the Java side.
>>
>> In the context of NVM technologies, it would be nice to be able to
>> persist a dataset to NVM and continue to do analytics on it, while
>> retaining a "handle" so that the dataset can be easily recovered in
>> the event of process failure. We may arrive at new use cases once some
>> of the basic tools exist.
>>
>> - Wes
>>
>> On Wed, Aug 30, 2017 at 6:19 PM, Gang(Gary) Wang <ga...@apache.org> wrote:
>> > Thank you for sharing the videos. We are very interested in supporting
>> > the Arrow data format and collections closely. Could you please point
>> > out which interfaces would allow Mnemonic to act as a memory provider
>> > for users to store and access Arrow-managed datasets? Thanks!
>> >
>> > Very truly yours,
>> > Gary.
>> >
>> >
>> > On Wed, Aug 30, 2017 at 2:11 PM, Ivan Sadikov <ivan.sadi...@gmail.com>
>> > wrote:
>> >
>> >> Great presentation! Thank you for sharing.
>> >>
>> >>
>> >> On Thu, 31 Aug 2017 at 8:02 AM, Wes McKinney <wesmck...@gmail.com> wrote:
>> >>
>> >> > Absolutely. I will do that now
>> >> >
>> >> > On Wed, Aug 30, 2017 at 3:33 PM, Julian Hyde <jh...@apache.org> wrote:
>> >> > > Thanks for sharing. Can we tweet those videos as well? I see that
>> >> > > https://twitter.com/apachearrow only tweeted your slides.
>> >> > >
>> >> > >> On Aug 26, 2017, at 1:11 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>> >> > >>
>> >> > >> hi all,
>> >> > >>
>> >> > >> In case folks here are interested, I gave a keynote this week at
>> >> > >> JupyterCon explaining my motivations for being involved in Apache
>> >> > >> Arrow and how I see it fitting in with the data science ecosystem
>> >> > >> long term:
>> >> > >>
>> >> > >> https://www.youtube.com/watch?v=wdmf1msbtVs
>> >> > >>
>> >> > >> I also gave an interview going a little deeper into some of the
>> >> > >> topics from the talk:
>> >> > >>
>> >> > >> https://www.youtube.com/watch?v=Q7y9l-L8yiU
>> >> > >>
>> >> > >> I believe we have an exciting journey ahead of us, but it's
>> >> > >> certainly going to take a lot of collaboration and community
>> >> > >> development.
>> >> > >>
>> >> > >> - Wes
>> >> > >
>> >> >
>> >>
>>
