On 25 February 2016 at 11:35, Todd Lipcon <t...@cloudera.com> wrote:

> One thing to keep in mind is that shared memory is not a performance
> panacea.
>
> We did some experimentation (actually, an intern on our team did --
> credit where credit is due) with shared memory transport between the
> Kudu C++ client and server. What we found was that, for single-batch
> transfers, shared memory was no faster than unix domain sockets (which
> were measurably but not a lot faster than TCP sockets). The issue is
> that, when a new process mmaps a shared memory segment, it still has
> to take minor page faults to set up page table entries on every page
> it hits.
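
That per-page cost is easy to see even in a trivial reader. A minimal
sketch (C++, Linux) -- the segment name and batch size below are made up,
and it assumes some writer has already created and filled the segment:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cstdint>
#include <cstdio>

int main() {
  // Hypothetical segment; a real writer would have created this with
  // shm_open(O_CREAT) + ftruncate and written an Arrow batch into it.
  const char* kName = "/arrow_demo_segment";
  const size_t kLen = 8 << 20;  // one ~8MB batch, as in the experiment above

  int fd = shm_open(kName, O_RDONLY, 0);
  if (fd < 0) { perror("shm_open"); return 1; }

  void* p = mmap(nullptr, kLen, PROT_READ, MAP_SHARED, fd, 0);
  if (p == MAP_FAILED) { perror("mmap"); return 1; }

  // Touch one byte per 4K page. Each first touch takes a minor page fault
  // to install a page table entry, even though the data is already
  // resident -- this is the per-transfer cost being described.
  const long page_size = sysconf(_SC_PAGESIZE);
  uint64_t checksum = 0;
  for (size_t off = 0; off < kLen; off += static_cast<size_t>(page_size)) {
    checksum += static_cast<const uint8_t*>(p)[off];
  }
  std::printf("touched %zu bytes, checksum %llu\n", kLen,
              static_cast<unsigned long long>(checksum));

  munmap(p, kLen);
  close(fd);
  return 0;
}

(Compile with -lrt on older glibc. MAP_POPULATE or madvise(MADV_WILLNEED)
will pre-fault the mapping, but that mostly moves the cost rather than
removing it.)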
>
> Shared memory with huge pages is one potential solution, but relies on
> either a fixed hugepage allocation (pain to configure) or hoping that
> background THP compaction is keeping up. Additionally, tmpfs-based
> shared memory doesn't support huge pages in current kernel versions
> AFAIK, so you're stuck with sysv shared memory, which is a bit more
> painful to use from Java.
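
For reference, the sysv route looks roughly like the sketch below --
SHM_HUGETLB is the Linux-specific flag that backs the segment with huge
pages, so one 2MB page stands in for 512 separate 4K faults. The key, size
and permissions are invented, and it only works if hugepages have been
reserved (vm.nr_hugepages) and the process has the needed privileges:

#include <sys/ipc.h>
#include <sys/shm.h>

#include <cstdio>

int main() {
  const key_t kKey = 0x41525731;   // arbitrary example key
  const size_t kLen = 1UL << 30;   // 1GB, a multiple of the 2MB hugepage size

  // SHM_HUGETLB needs reserved hugepages and (on older kernels)
  // CAP_IPC_LOCK or membership in vm.hugetlb_shm_group.
  int shmid = shmget(kKey, kLen, IPC_CREAT | SHM_HUGETLB | 0600);
  if (shmid < 0) { perror("shmget(SHM_HUGETLB)"); return 1; }

  void* base = shmat(shmid, nullptr, 0);  // attach into this process
  if (base == reinterpret_cast<void*>(-1)) { perror("shmat"); return 1; }

  // ... write/read Arrow batches at `base` ...

  shmdt(base);
  shmctl(shmid, IPC_RMID, nullptr);  // destroy once all attachments are gone
  return 0;
}

From Java you'd be going through JNI or JNA just to reach shmget/shmat at
all, which I take to be the pain being referred to.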
>
> If you're able to reuse a shared memory region across multiple data
> transfers, the improvements are substantial, though, so it's not a
> dead end. It's just most useful for the case where you're streaming a
> bunch of *batches* of data (eg 8M each) between two processes, and can
> have some way of coordinating between them.
>

It seems like Arrow would benefit from a complementary effort to define a
(simple) streaming memory transfer protocol between processes. Although Wes
mentioned RPC earlier, I'd hope that's really a placeholder for "fast
inter-process message passing", since the overheads of RPC as usually
realised would likely be very significant, even just for control messages.

This may just mean a common way to access a ring buffer with a defined
in-memory layout and a way of signalling "buffer no longer empty". Without
doing this in a standardised way, we'd run the risk of N^2 communication
protocols between N different analytics systems.
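
To make that slightly more concrete, here is one possible shape -- purely a
sketch, assuming a single producer and a single consumer per ring and that
std::atomic<uint64_t> is lock-free (and so usable across processes); none
of this is a proposed Arrow layout:

#include <atomic>
#include <cstdint>

// Describes one Arrow batch living elsewhere in the same shared segment.
// Offsets rather than pointers, so the layout is position independent.
struct BatchDescriptor {
  uint64_t offset;  // byte offset of the batch within the segment
  uint64_t length;  // batch length in bytes (eg the ~8M batches above)
};

// Lives at the start of the shared segment; slot_count descriptors follow
// immediately after it.
struct RingHeader {
  uint32_t magic;                   // layout/version check
  uint32_t slot_count;              // power of two
  std::atomic<uint64_t> write_pos;  // advanced only by the producer
  std::atomic<uint64_t> read_pos;   // advanced only by the consumer
};

inline BatchDescriptor* Slots(RingHeader* h) {
  return reinterpret_cast<BatchDescriptor*>(h + 1);
}

// Producer: publish a batch already written into the segment's data region.
inline bool TryPublish(RingHeader* h, BatchDescriptor d) {
  uint64_t w = h->write_pos.load(std::memory_order_relaxed);
  uint64_t r = h->read_pos.load(std::memory_order_acquire);
  if (w - r == h->slot_count) return false;  // ring is full
  Slots(h)[w % h->slot_count] = d;
  h->write_pos.store(w + 1, std::memory_order_release);
  return true;
}

// Consumer: take the next batch descriptor, if any.
inline bool TryConsume(RingHeader* h, BatchDescriptor* out) {
  uint64_t r = h->read_pos.load(std::memory_order_relaxed);
  uint64_t w = h->write_pos.load(std::memory_order_acquire);
  if (r == w) return false;  // ring is empty
  *out = Slots(h)[r % h->slot_count];
  h->read_pos.store(r + 1, std::memory_order_release);
  return true;
}

The release store on write_pos is the "buffer no longer empty" signal; a
futex or eventfd could carry the wakeup if polling isn't acceptable.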

Such a thing could also potentially be used for RDMA-based signalling that
avoids the CPU, as per this paper from MSR:
https://www.usenix.org/conference/nsdi14/technical-sessions/dragojevic



>
> Another difficulty is security -- with a format like Arrow that embeds
> internal pointers, any process following the pointers needs to either
> trust the other processes with write access to the segment, or needs
> to carefully verify any pointers before following them. Otherwise,
> it's pretty easy to crash the reader if you're a malicious writer. In
> the case of using arrow from a shared daemon (eg impalad) to access an
> untrusted UDF (eg python running in a setuid task container), this is
> a real concern IMO.
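
The reader-side mitigation is presumably the boring one: treat every
embedded offset as untrusted and bounds-check it against the mapped region
before following it, along the lines of the check below (names are
illustrative). It doesn't help against a writer that keeps mutating the
segment underneath the reader, though, so immutability guarantees or
copying still matter:

#include <cstddef>
#include <cstdint>

// Returns a pointer to the requested slice, or nullptr if the untrusted
// (offset, length) pair doesn't fit inside the mapped segment. Written to
// avoid overflow in offset + length.
const uint8_t* CheckedSlice(const uint8_t* segment_base, size_t segment_len,
                            uint64_t offset, uint64_t length) {
  if (offset > segment_len) return nullptr;
  if (length > segment_len - offset) return nullptr;
  return segment_base + offset;
}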
>
> -Todd
>
>
>
> On Thu, Feb 25, 2016 at 8:37 AM, Wes McKinney <w...@cloudera.com> wrote:
> > hi Leif -- you've articulated almost exactly my vision for pandas
> > interoperability with Spark via Arrow. There are some questions to sort
> > out, like shared memory / mmap management and security / sandboxing, but
> > in general moving toward a model where RPCs contain shared memory
> > offsets to Arrow data rather than shipping large chunks of serialized
> > data through stdin/stdout would be a huge leap forward.
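
In other words the RPC payload stays tiny: a locator for the data rather
than the data itself. Just as an illustration of the idea (not a proposed
wire format, and every field name here is made up):

#include <cstdint>
#include <string>

// Control-plane message: tells the peer where to find an Arrow record
// batch already sitting in shared memory, instead of shipping the
// serialized bytes over the socket.
struct BatchLocator {
  std::string segment_name;  // eg a name both sides can shm_open()
  uint64_t offset;           // where the record batch starts in the segment
  uint64_t length;           // record batch size in bytes
  uint64_t schema_id;        // refers to a schema negotiated up front
};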
> >
> > On Wed, Feb 24, 2016 at 4:26 PM, Leif Walsh <leif.wa...@gmail.com> wrote:
> >
> >> Here's the image that popped into my mind when I heard about this
> >> project; at best it's a motivating example, at worst it's a distraction:
> >>
> >> 1. Spark reads parquet from wherever into an arrow structure in shared
> >> memory.
> >> 2. Spark executor calls into the Python half of pyspark with a handle to
> >> this memory.
> >> 3. Pandas computes some summary over this data in place, produces some
> >> smaller result set, also in memory, also in arrow format.
> >> 4. Spark driver collects results (without ser/de overheads, just
> >> straight byte buffer reads across the network), concatenates them in
> >> memory.
> >> 5. Spark driver hands that memory to local pandas, which does more
> >> computation over that in place.
> >>
> >> This is what got me excited: it kills a lot of the Spark overheads
> >> related to data (and for a kicker, also enables full pandas
> >> compatibility along with statsmodels/scikit/etc in step 3). I suppose
> >> if this is not a vision the Arrow devs have, I'd be interested to hear
> >> how you see it. Not looking for any promises yet though.
> >> On Wed, Feb 24, 2016 at 14:52 Corey Nolet <cjno...@apache.org> wrote:
> >>
> >> > So far, how does the integration with the Spark project look? Do you
> >> > envision cached Spark partitions allocating Arrows? I could imagine
> >> > this would be absolutely huge for being able to ask questions of
> >> > real-time data sets across applications.
> >> >
> >> > On Wed, Feb 24, 2016 at 2:49 PM, Zhe Zhang <z...@apache.org> wrote:
> >> >
> >> > > Thanks for the insights Jacques. Interesting to learn the thoughts
> >> > > on zero-copy sharing.
> >> > >
> >> > > mmap allows sharing address spaces via the filesystem interface.
> >> > > That has some security concerns, but with the immutability
> >> > > restrictions (as you clarified here) it sounds feasible. What other
> >> > > options do you have in mind?
> >> > >
> >> > > On Wed, Feb 24, 2016 at 11:30 AM Corey Nolet <cjno...@gmail.com> wrote:
> >> > >
> >> > > > Agreed,
> >> > > >
> >> > > > I thought the whole purpose was to share the memory space (using
> >> > > > possibly unsafe operations like ByteBuffers) so that it could be
> >> > > > directly shared without copy. My interest in this is to have it
> >> > > > enable fully in-memory computation. Not just "processing" as in
> >> > > > Spark, but as a fully in-memory datastore that one application can
> >> > > > expose and share with others (e.g. an in-memory structure is
> >> > > > constructed from a series of parquet files, somehow, then Spark
> >> > > > pulls it in, does some computations, exposes a data set, etc...).
> >> > > >
> >> > > > If you are leaving the allocation of the memory to the applications
> >> > > > and underneath the memory is being allocated using direct
> >> > > > ByteBuffers, I can't see exactly why the problem is fundamentally
> >> > > > hard -- especially if the applications themselves are worried about
> >> > > > exposing their own memory spaces.
> >> > > >
> >> > > >
> >> > > > On Wed, Feb 24, 2016 at 2:17 PM, Andrew Brust
> >> > > > <andrew.br...@bluebadgeinsights.com> wrote:
> >> > > >
> >> > > > > Hmm...that's not exactly how Jacques described things to me when
> >> > > > > he briefed me on Arrow ahead of the announcement.
> >> > > > >
> >> > > > > -----Original Message-----
> >> > > > > From: Zhe Zhang [mailto:z...@apache.org]
> >> > > > > Sent: Wednesday, February 24, 2016 2:08 PM
> >> > > > > To: dev@arrow.apache.org
> >> > > > > Subject: Re: Question about mutability
> >> > > > >
> >> > > > > I don't think one application/process's memory space will be
> >> > > > > made available to other applications/processes. It's
> >> > > > > fundamentally hard for processes to share their address spaces.
> >> > > > >
> >> > > > > IIUC, with Arrow, when application A shares data with
> >> > > > > application B, the data is still duplicated in the memory spaces
> >> > > > > of A and B. It's just that data serialization/deserialization
> >> > > > > are much faster with Arrow (compared with Protobuf).
> >> > > > >
> >> > > > > On Wed, Feb 24, 2016 at 10:40 AM Corey Nolet <cjno...@gmail.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Forgive me if this question seems ill-informed. I just started
> >> > > > > > looking at Arrow yesterday. I looked around the github a tad.
> >> > > > > >
> >> > > > > > Are you expecting the memory space held by one application to
> >> > > > > > be mutable by that application and made available to all
> >> > > > > > applications trying to read the memory space?
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >> --
> >> --
> >> Cheers,
> >> Leif
> >>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679
