Sorry for the slow reply, it's been crunch time on the 1.1 freeze...

What's a good starting point to get a feel for what you've added? Is
it PBSTracker?

Is this conceptually different from something like
https://issues.apache.org/jira/browse/CASSANDRA-1123, apart from the
fact that you're specifically concerned with PBS-related metrics?

On Thu, Jan 19, 2012 at 11:59 AM, Peter Bailis <pbai...@cs.berkeley.edu> wrote:
> We recently completed research at UC Berkeley that's highly relevant to
> Cassandra and are interested in feedback from the Cassandra developer
> community. In brief, eventually consistent replication (which is often
> faster than strongly consistent replication) provides no *guarantees* about
> the recency of data returned. However, we can accurately provide
> *expectations* of data recency. Our work, which we call Probabilistically
> Bounded Staleness (PBS), helps make these predictions. Using PBS, we can
> optimize the trade-off between latency and consistency provided by partial
> quorums (R+W <= N) by predicting both with high accuracy.
>
> Currently, in Cassandra, there's no good way to predict the performance
> benefits of using partial quorums or the consistency they provide. However,
> as you're probably well aware, Cassandra uses partial quorums (N=3, R=W=1)
> by *default*, so this work is particularly relevant to many deployments. By
> measuring the latency of messaging and using modeling techniques we've
> developed, Cassandra can do better by describing the probability of
> consistency according to both time and versions (see an interactive demo in
> your browser at http://cs.berkeley.edu/~pbailis/projects/pbs/#demo and a
> good write-up by DataStax's Paul Cannon on their blog last week:
> http://www.datastax.com/dev/blog/your-ideal-performance-consistency-tradeoff).
> Moreover, these techniques are broadly applicable: for example, in our
> Technical Report (http://cs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-4.pdf),
> we analyze Cassandra as well as production deployments of Voldemort and
> Riak at LinkedIn and Yammer.
>
> We've developed a patch for Cassandra that performs this profiling and
> analysis and are potentially interested in working to integrate this as a
> feature in Cassandra (see code and documentation at:
> https://github.com/pbailis/cassandra-pbs).
>
> We welcome any feedback or questions you might have.
>
> Thanks!
> Peter Bailis
> UC Berkeley
>
> More info:
> You can read an overview of PBS on our project page:
> http://cs.berkeley.edu/~pbailis/projects/pbs/
> You can also read our technical report on PBS, which goes into more
> detail: http://cs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-4.pdf
>
> Daniel Abadi recently blogged about the latency-consistency trade-off:
> http://dbmsmusings.blogspot.com/2011/12/replication-and-latency-consistency.html
> Henry Robinson (Cloudera) also blogged about PBS:
> http://the-paper-trail.org/blog/?p=334



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
