I share the concerns Todd raised below. With an mmap-based approach,
we are treating shared memory objects like files. This brings in all the
filesystem-related considerations, such as ACLs and lifecycle management.
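
To make the file-like behaviour concrete, here is a minimal sketch, assuming
a POSIX system and Python's standard mmap module. It is only an illustration,
not Arrow code: the helper names create_shared_region and read_shared_region
are hypothetical, and the temp directory stands in for something like /dev/shm.

```python
import mmap
import os
import stat
import tempfile


def create_shared_region(directory, size):
    """Create a file-backed region and map it. Returns (path, mmap object)."""
    # mkstemp creates the file readable/writable by the owner only, so
    # access control is whatever the filesystem permissions say it is.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.ftruncate(fd, size)
        region = mmap.mmap(fd, size)  # shared, read-write mapping by default
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
    return path, region


def read_shared_region(path, length):
    """What a second process would do: open the same path and map it."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[:length]  # copied out here only to show the contents


shm_dir = tempfile.gettempdir()  # assumption: stand-in for /dev/shm
path, region = create_shared_region(shm_dir, 4096)
region[:5] = b"arrow"
region.flush()

mode = stat.S_IMODE(os.stat(path).st_mode)
data = read_shared_region(path, 5)

# Lifecycle is manual: if the creator crashes before this point, the
# file (and the memory behind it) is left behind for someone to clean up.
region.close()
os.unlink(path)
```

Both of the concerns show up directly: who may attach is decided by the file
mode, and nothing removes the region unless some process remembers to unlink it.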

Stepping back a little, the shared-memory work isn't really specific to
Arrow. A few questions related to this:
1) Has the topic been discussed in the context of protobuf (or other IPC
protocols) before? Cap'n Proto (https://capnproto.org/) appears to offer
zero-copy shared memory, though I haven't read the implementation details.
2) If the shared-memory work benefits a wide range of protocols, should it
be a generalized and standalone library?

Thanks,
Zhe

On Tue, Mar 15, 2016 at 8:30 PM Todd Lipcon <t...@cloudera.com> wrote:

> Having thought about this quite a bit in the past, I think the mechanics of
> how to share memory are by far the easiest part. The much harder part is
> the resource management and ownership. Questions like:
>
> - if you are using an mmapped file in /dev/shm/, how do you make sure it
> gets cleaned up if the process crashes?
> - how do you allocate memory to it? there's nothing ensuring that /dev/shm
> doesn't swap out if you try to put too much in there, and then your
> in-memory super-fast access will basically collapse under swap thrashing
> - how do you do lifecycle management across the two processes? If, say,
> Kudu wants to pass a block of data to some Python program, how does it know
> when the Python program is done reading it and it should be deleted? What
> if the python program crashed in the middle - when can Kudu release it?
> - how do you do security? If both sides of the connection don't trust each
> other, and use length prefixes and offsets, you have to be constantly
> validating and re-validating everything you read.
>
> Another big factor is that shared memory is not, in my experience,
> immediately faster than just copying data over a unix domain socket. In
> particular, the first time you read an mmapped file, you'll end up paying
> minor page fault overhead on every page. This can be improved with
> HugePages, but huge page mmaps are not supported yet in current Linux (work
> going on currently to address this). So you're left with hugetlbfs, which
> involves static allocations and much more pain.
>
> All the above is a long way to say: let's make sure we do the right
> prototyping and up-front design before jumping into code.
>
> -Todd
>
>
>
> On Tue, Mar 15, 2016 at 5:54 PM, Jacques Nadeau <jacq...@apache.org>
> wrote:
>
> > @Corey
> > The POC Steven and Wes are working on is based on MappedBuffer but I'm
> > looking at using netty's fork of tcnative to use shared memory directly.
> >
> > @Yiannis
> > We need to have both RPC and a shared memory mechanism (what I'm
> > inclined to call IPC, though it is a specific kind of IPC). The idea is
> > we negotiate via RPC and then, if we determine shared locality, we work
> > over shared memory (preferably for both data and control). So the system
> > interacting with HBase in your example would be the one responsible for
> > placing collocated execution to take advantage of IPC.
> >
> > How do others feel about my redefinition of IPC to mean same-memory-space
> > communication (either via shared memory or RDMA) versus RPC as
> > socket-based communication?
> >
> >
> > On Tue, Mar 15, 2016 at 5:38 PM, Corey Nolet <cjno...@gmail.com> wrote:
> >
> > > I was seeing Netty's unsafe classes being used here, not mapped byte
> > > buffer. I'm not sure that statement is completely correct, but I'll
> > > have to dig through the code again to figure that out.
> > >
> > > The more I looked at Unsafe, the more it made sense why it would be
> > > used. Apparently it's also supposed to be included in Java 9 as a
> > > first-class API.
> > > On Mar 15, 2016 7:03 PM, "Wes McKinney" <w...@cloudera.com> wrote:
> > >
> > > > My understanding is that you can use java.nio.MappedByteBuffer to
> > > > work with memory-mapped files as one way to share memory pages
> > > > between Java (and non-Java) processes without copying.
> > > >
> > > > I am hoping that we can reach a POC of zero-copy Arrow memory sharing
> > > > Java-to-Java and Java-to-C++ in the near future. Indeed this will have
> > > > huge implications once we get it working end to end (for example,
> > > > receiving memory from a Java process in Python without a heavy ser-de
> > > > step -- it's what we've always dreamed of) and with the metadata and
> > > > shared memory control flow standardized.
> > > >
> > > > - Wes
> > > >
> > > > On Wed, Mar 9, 2016 at 9:25 PM, Corey J Nolet <cjno...@gmail.com> wrote:
> > > > > If I understand correctly, Arrow is using Netty underneath, which is
> > > > > using Sun's Unsafe API in order to allocate direct byte buffers off
> > > > > heap. It is using Netty to communicate, between "client" and
> > > > > "server", information about memory addresses for data that is being
> > > > > requested.
> > > > >
> > > > > I've never attempted to use the Unsafe API to access off-heap memory
> > > > > that has been allocated in one JVM from another JVM, but I'm assuming
> > > > > this must be the case in order to claim that the memory is being
> > > > > accessed directly without being copied, correct?
> > > > >
> > > > > The implication here is huge. If the memory is being directly shared
> > > > > across processes by them being allowed to directly reach into the
> > > > > direct byte buffers, that's true shared memory. Otherwise, if there
> > > > > are copies going on, it's less appealing.
> > > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Sent from my iPad
> > > >
> > >
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
