It should be possible to have an ArrowBuf backed by a MappedByteBuffer. Anyone reading is welcome to dig in and write a patch for this.
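
A minimal sketch of the JDK half of this idea, using only java.nio: map a file, then take a zero-copy view of a byte range. It does not touch ArrowBuf or Netty's ByteBuf, and the class and method names (MappedSliceSketch, mapReadOnly, slice, writeMapAndRead) are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class; not part of Arrow or Netty.
final class MappedSliceSketch {

    /** Maps the whole file read-only. The mapping remains valid after the channel is closed. */
    static MappedByteBuffer mapReadOnly(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns a view of [offset, offset + length) that shares memory with the source. */
    static ByteBuffer slice(ByteBuffer source, int offset, int length) {
        ByteBuffer view = source.duplicate(); // own position/limit, same underlying bytes
        view.position(offset);
        view.limit(offset + length);
        return view.slice();                  // index 0 of the result is source[offset]
    }

    /** Writes the longs to a temp file, maps it, and reads one value back through a slice. */
    static long writeMapAndRead(long[] values, int index) {
        try {
            Path file = Files.createTempFile("mapped-sketch", ".bin");
            file.toFile().deleteOnExit();
            ByteBuffer out = ByteBuffer.allocate(values.length * Long.BYTES);
            for (long v : values) {
                out.putLong(v);               // big-endian, the ByteBuffer default
            }
            Files.write(file, out.array());
            MappedByteBuffer mapped = mapReadOnly(file);
            ByteBuffer one = slice(mapped, index * Long.BYTES, Long.BYTES);
            return one.getLong(0);            // slices also default to big-endian
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling MappedSliceSketch.writeMapAndRead(new long[] {10L, 20L, 30L}, 1) reads the second value straight from the mapped pages. Wiring such a buffer into ArrowBuf's allocator and reference counting is the part that would still need a patch.
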
Semantically this is what we have done in C++ -- a memory map inherits
from arrow::Buffer, so we can slice and dice a memory map as we would
any other Buffer object

https://github.com/apache/arrow/blob/master/cpp/src/arrow/io/file.cc#L501

On Mon, Sep 4, 2017 at 4:05 AM, Gonzalo Ortiz Jaureguizar <golthir...@gmail.com> wrote:
> This is a very interesting feature. It's very surprising that there is no
> ByteBuffer implementation backed on a MappedByteBuffer. As far as I
> understand, it should be trivial to implement (maybe not to pool) as
> usually ByteBuf is backed on a ByteBuffer and MappedByteBuffer extends
> that. But I didn't find implementations when I goggled for it.
>
> 2017-09-03 16:12 GMT+02:00 Wes McKinney <wesmck...@gmail.com>:
>
>> I think ideally we would have a Java interface that would support all of:
>>
>> - Memory mapped files
>> - Anonymous shared memory segments (e.g. POSIX shm)
>> - NVM / Mnemonic
>>
>> We already have the ability to do zero-copy reads from buffer-like
>> objects in C++ and IO interfaces that support zero copy (like memory
>> mapped files). We can do zero-copy reads from ArrowBuf in Java but we
>> are missing the interfaces to shared memory sources
>>
>> - Wes
>>
>> On Thu, Aug 31, 2017 at 5:09 PM, Gang(Gary) Wang <ga...@apache.org> wrote:
>> > Hi Wes,
>> >
>> > Thank you for the explanation. the usage of
>> > https://issues.apache.org/jira/browse/ARROW-721 could be directly
>> > supported by Mnemonic through DurableBuffer and DurableChunk, the
>> > DurableChunk makes use of unsafe to expose a plain memory space for
>> > Arrow to use without performance penalties. that's why most of the
>> > big data frameworks take the advantage of unsafe, please refer to
>> > https://mnemonic.apache.org/docs/domusecases.html for the use cases.
>> > we could work on this ticket if you think that's exactly what you want.
>> >
>> > Regarding the NVM tech., that is what Mnemonic created for. it could
>> > be used to directly persist Java generic objects and collection on
>> > NVM with no SerDe. so what kind of basic tools you mentioned?
>> > probably, we can help also identify the gaps for Mnemonic as well.
>> > Thanks!
>> >
>> > Very truly yours,
>> > Gary
>> >
>> > On Thu, Aug 31, 2017 at 12:32 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>> >
>> >> hi Gary,
>> >>
>> >> The Java libraries are not yet capable of writing or zero-copy reads
>> >> of Arrow datasets to/from shared memory or memory-mapped files:
>> >> https://issues.apache.org/jira/browse/ARROW-721. We've developed quite
>> >> a bit of technology on the C++ side for dealing with shared memory IPC
>> >> but we need someone to help with that on the Java side.
>> >>
>> >> In the context of NVM technologies, it would be nice to be able to
>> >> persist a dataset to NVM and continue to do analytics on it, while
>> >> retaining a "handle" so that the dataset can be easily recovered in
>> >> the event of process failure. We may arrive at new use cases once some
>> >> of the basic tools exist.
>> >>
>> >> - Wes
>> >>
>> >> On Wed, Aug 30, 2017 at 6:19 PM, Gang(Gary) Wang <ga...@apache.org> wrote:
>> >> > Thank you for sharing the videos. We are very interested in how to
>> >> > support Arrow data format and collection very closely, could you
>> >> > please help to point out which interfaces to allow Mnemonic act as
>> >> > a memory provider for the user to store and access Arrow managed
>> >> > datasets ? Thanks!
>> >> >
>> >> > Very truly yours,
>> >> > Gary.
>> >> >
>> >> > On Wed, Aug 30, 2017 at 2:11 PM, Ivan Sadikov <ivan.sadi...@gmail.com> wrote:
>> >> >
>> >> >> Great presentation! Thank you for sharing.
>> >> >>
>> >> >> On Thu, 31 Aug 2017 at 8:02 AM, Wes McKinney <wesmck...@gmail.com> wrote:
>> >> >>
>> >> >> > Absolutely. I will do that now
>> >> >> >
>> >> >> > On Wed, Aug 30, 2017 at 3:33 PM, Julian Hyde <jh...@apache.org> wrote:
>> >> >> > > Thanks for sharing. Can we tweet those videos as well? I see
>> >> >> > > that https://twitter.com/apachearrow only tweeted your slides.
>> >> >> > >
>> >> >> > >> On Aug 26, 2017, at 1:11 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>> >> >> > >>
>> >> >> > >> hi all,
>> >> >> > >>
>> >> >> > >> In case folks here are interested, I gave a keynote this week at
>> >> >> > >> JupyterCon explaining my motivations for being involved in Apache
>> >> >> > >> Arrow and how I see it fitting in with the data science ecosystem
>> >> >> > >> long term:
>> >> >> > >>
>> >> >> > >> https://www.youtube.com/watch?v=wdmf1msbtVs
>> >> >> > >>
>> >> >> > >> I also gave an interview going a little deeper into some of the
>> >> >> > >> topics from the talk:
>> >> >> > >>
>> >> >> > >> https://www.youtube.com/watch?v=Q7y9l-L8yiU
>> >> >> > >>
>> >> >> > >> I believe we have an exciting journey ahead of us, but it's
>> >> >> > >> certainly going to take a lot of collaboration and community
>> >> >> > >> development.
>> >> >> > >>
>> >> >> > >> - Wes