Re: CASSANDRA-10993 Approaches

Stefan Podkowinski Thu, 18 Aug 2016 13:45:42 -0700

>From my perspective, one of the most important reasons for RxJava would be
the strategic option to integrate reactive streams [1] in the overall
Cassandra architecture at some point in the future. Reactive streams would
allow to design back pressure fundamentally different compared to what we
do in the current work-queue based execution model. Think about the
optimizations currently deployed to walk a thin line between throughput,
latency and GC pressure. About the lack of coordination between individual
processes such as compactions, streaming and client requests that will
effect each other; where we can just hope that clients back off due to
latency aware policies, streams that will eventually timeout, or
compactions that hopefully get enough work done at some point. We even have
to tell people to tune batch sizes to not overwhelm nodes in the cluster.
Squeezing out n% during performance tests is nice, but IMO 10993 should
also address how to get more control on using system resources and a
reactive stream based approach could help with that.


[1] https://github.com/ReactiveX/RxJava/wiki/Reactive-Streams


On Wed, Aug 17, 2016 at 9:54 PM, Jake Luciani <jak...@gmail.com> wrote:

> I think I outlined the tradeoffs I see between the roll our own vs use a
> reactive framework in
> https://issues.apache.org/jira/plugins/servlet/mobile#
> issue/CASSANDRA-10528
>
> My view is we should try to utilize the existing before we start writing
> our own. And even if we do write our own keep it reactive since reactive
> APIs are going to be adopted in the Java 9 spec.  There is an entire
> community out there thinking about asynchronous programming that we can tap
> into.
>
> I don't buy the argument (yet) that Rx or other libraries lack the control
> we need. In fact these APIs are quite extensible.
>
> On Aug 17, 2016 3:08 PM, "Tyler Hobbs" <ty...@datastax.com> wrote:
>
> > In the spirit of the recent thread about discussing large changes on the
> > Dev ML, I'd like to talk about CASSANDRA-10993, the first step in the
> > "thread per core" work.
> >
> > The goal of 10993 is to transform the read and write paths into an
> > event-driven model powered by event loops.  This means that each request
> > can be handled on a single thread (although typically broken up into
> > multiple steps, depending on I/O and locking) and the old mutation and
> read
> > thread pools can be removed.  So far, we've prototyped this with a couple
> > of approaches:
> >
> > The first approach models each request as a state machine (or composition
> > of state machines).  For example, a single write request is encapsulated
> in
> > a WriteTask object which moves through a series of states as portions of
> > the write complete (allocating a commitlog segment, syncing the
> commitlog,
> > receiving responses from remote replicas).  These state transitions are
> > triggered by Events that are emitted by, e.g., the
> > CommitlogSegmentManager.  The event loop that manages tasks, events,
> > timeouts, and scheduling is custom and is (currently) closely tied to a
> > Netty event loop.  Here are a couple of example classes to take a look
> at:
> >
> > WriteTask:
> > https://github.com/thobbs/cassandra/blob/CASSANDRA-
> > 10993-WIP/src/java/org/apache/cassandra/poc/WriteTask.java
> > EventLoop:
> > https://github.com/thobbs/cassandra/blob/CASSANDRA-
> > 10993-WIP/src/java/org/apache/cassandra/poc/EventLoop.java
> >
> > The second approach utilizes RxJava and the Observable pattern.  Where we
> > would wait for emitted events in the state machine approach, we instead
> > depend on an Observable to "push" the data/result we're awaiting.
> > Scheduling is handled by an Rx scheduler (which is customizable).  The
> code
> > changes required for this are, overall, less intrusive.  Here's a quick
> > example of what this looks like for high-level operations:
> > https://github.com/thobbs/cassandra/blob/rxjava-rebase/
> > src/java/org/apache/cassandra/service/StorageProxy.java#L1724-L1732
> > .
> >
> > So far we've benchmarked both approaches on in-memory reads to get an
> idea
> > of the upper-bound performance of both approaches.  Throughput appears to
> > be very similar with both branches.
> >
> > There are a few considerations up for debate as to which approach we
> should
> > go with that I would appreciate input on.
> >
> > First, performance.  There are concerns that going with Rx (or something
> > similar) may limit the peak performance we can eventually attain in a
> > couple of ways.  First, we don't have as much control over the event
> loop,
> > scheduling, and chunking of tasks.  With the state machine approach,
> we're
> > writing all of this, so it's totally under our control.  With Rx, a lot
> of
> > things are customizable or already have decent tools, but this may come
> up
> > short in critical ways.  Second, the overhead of the Observable machinery
> > may become significant as other bottlenecks are removed.  Of course,
> > WriteTask et al have their own overhead, but once again, we have more
> > control there.
> >
> > The second consideration is code style and ease of understanding.  I
> think
> > both of these approaches have downsides in different areas.  The state
> > machines are very explicit (an upside), but also very verbose and
> somewhat
> > disjointed.  Most of the complex operations in Cassandra can't cleanly be
> > represented as a single state machine, because they're logically multiple
> > state machines operating in parallel (e.g. the local write path and the
> > remote write path in WriteTask).  After working on the prototypes, I've
> > found the state machines to be harder to logically follow than I had
> > hoped.  Perhaps we could come up with better abstractions and patterns
> for
> > this, but that's the current state of things.  On the Rx side, the
> downside
> > is that the behavior is much less explicit.  Additionally, some find it
> > more difficult to mentally follow the flow of execution.  Based on my
> past
> > work with a large Twisted Python codebase, I'll agree that it's tough to
> > get used to, but not unmanageable with experience and good coding
> patterns.
> >
> > A third consideration is code reuse.  A big advantage of Rx is that it
> > comes with many tools for transforming Observables, handling multiple
> > Observables, error handling, and tracing.  With the state machine
> approach,
> > we would need to write equivalents for these from scratch.  This is a
> > non-trivial amount of work that might make the project take significantly
> > longer to complete.  Combining this with fact that the Rx approach would
> be
> > less invasive, it seems like we would have an easier time introducing
> > incremental changes to the code base rather than having a big-bang
> commit.
> >
> > If I can boil these concerns down to one tradeoff, it's this: do we want
> to
> > expend more effort and have more explicit code and complete control, or
> do
> > we want to piggyback on the Rx work, give up some control, and
> (hopefully)
> > get to the next, deeper optimizations sooner?
> >
> > Thanks for any input on this topic.
> >
> >
> > --
> > Tyler Hobbs
> > DataStax <http://datastax.com/>
> >
>

Re: CASSANDRA-10993 Approaches

Reply via email to