Agreed, I thought the whole purpose was to share the memory space (using possibly unsafe operations like direct ByteBuffers) so that data could be shared directly, without copying. My interest in this is to have it enable fully in-memory computation: not just "processing" as in Spark, but a fully in-memory datastore that one application can expose and share with others (e.g., an in-memory structure is constructed from a series of Parquet files, somehow; then Spark pulls it in, does some computations, exposes a data set, etc.).
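To make that concrete, here is a rough sketch of the producer side of what I have in mind. It uses nothing Arrow-specific, just standard java.nio memory mapping, and the file path, layout (a single int32 column), and sizes are made up purely for illustration:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ColumnWriter {
    public static void main(String[] args) throws IOException {
        // Hypothetical path; any file visible to both processes would do.
        Path path = Paths.get("/tmp/int-column.example");
        int valueCount = 1_000_000;
        long byteSize = (long) valueCount * Integer.BYTES;

        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Map the file into this process's address space.
            MappedByteBuffer buf =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, byteSize);

            // Write a contiguous int32 column directly into the mapping.
            for (int i = 0; i < valueCount; i++) {
                buf.putInt(i * Integer.BYTES, i);
            }
            buf.force(); // flush so other processes see the data
        }
    }
}

Any other process that maps the same file sees those exact pages; there is no serialization step in between.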
If you are leaving the allocation of the memory to the applications, and underneath the memory is being allocated using direct ByteBuffers, I can't see exactly why the problem is fundamentally hard, especially if the applications themselves are worried about exposing their own memory spaces. (A rough sketch of the consumer side of this is at the bottom of this mail, after the quoted thread.)

On Wed, Feb 24, 2016 at 2:17 PM, Andrew Brust <andrew.br...@bluebadgeinsights.com> wrote:

> Hmm...that's not exactly how Jaques described things to me when he briefed me on Arrow ahead of the announcement.
>
> -----Original Message-----
> From: Zhe Zhang [mailto:z...@apache.org]
> Sent: Wednesday, February 24, 2016 2:08 PM
> To: dev@arrow.apache.org
> Subject: Re: Question about mutability
>
> I don't think one application/process's memory space will be made available to other applications/processes. It's fundamentally hard for processes to share their address spaces.
>
> IIUC, with Arrow, when application A shares data with application B, the data is still duplicated in the memory spaces of A and B. It's just that data serialization/deserialization are much faster with Arrow (compared with Protobuf).
>
> On Wed, Feb 24, 2016 at 10:40 AM Corey Nolet <cjno...@gmail.com> wrote:
>
> > Forgive me if this question seems ill-informed. I just started looking at Arrow yesterday. I looked around the github a tad.
> >
> > Are you expecting the memory space held by one application to be mutable by that application and made available to all applications trying to read the memory space?
> >
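Continuing the made-up example from earlier in this mail, here is the consumer side, again plain java.nio rather than anything from the Arrow API. It maps the same hypothetical file read-only and computes over the bytes in place, with no deserialization into Java objects:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ColumnReader {
    public static void main(String[] args) throws IOException {
        // Same hypothetical file the writer process produced.
        Path path = Paths.get("/tmp/int-column.example");
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer buf =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

            // Sum the int32 column by reading the shared pages in place;
            // nothing is copied into per-process data structures first.
            long sum = 0;
            int valueCount = (int) (channel.size() / Integer.BYTES);
            for (int i = 0; i < valueCount; i++) {
                sum += buf.getInt(i * Integer.BYTES);
            }
            System.out.println("sum = " + sum);
        }
    }
}

That, at least, is the kind of sharing I had in mind; whether Arrow itself will expose buffers this way, or only standardize the layout so each process keeps its own copy, is the part I'm trying to understand.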