We recently completed research at UC Berkeley that's highly relevant to Cassandra and are interested in feedback from the Cassandra developer community. In brief, eventually consistent replication (which is often faster than strongly consistent replication) provides no *guarantees* about the recency of data returned. However, we can accurately provide *expectations* of data recency. Our work, which we call Probabilistically Bounded Staleness (PBS), helps make these predictions. Using PBS, we can optimize the trade-off between latency and consistency provided by partial quorums (R+W <= N) by predicting both with high accuracy.
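As a rough illustration of the version-based side of this, here is a minimal Python sketch of the simplest closed-form model for partial quorums. It assumes read and write quorums are chosen uniformly at random and that a write reaches only its W replicas (no later propagation), which is deliberately pessimistic. The function names are hypothetical and are not taken from the Cassandra patch; the time-based predictions additionally need measured message latencies and are not covered here.

```python
from math import comb


def quorum_miss_probability(n: int, r: int, w: int) -> float:
    """Probability that a read quorum of size r contains none of the
    w replicas that hold the latest write (assumes uniformly random
    quorums and no propagation after the write completes)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("require 1 <= r <= n and 1 <= w <= n")
    # comb(n - w, r) is 0 when r > n - w, i.e. when R + W > N.
    return comb(n - w, r) / comb(n, r)


def k_staleness_probability(n: int, r: int, w: int, k: int) -> float:
    """Probability that a read returns one of the last k versions,
    assuming each write picks its quorum independently."""
    if k < 1:
        raise ValueError("require k >= 1")
    return 1.0 - quorum_miss_probability(n, r, w) ** k


if __name__ == "__main__":
    # N=3, R=W=1 partial quorum versus N=3, R=W=2 strict quorum.
    for r, w in [(1, 1), (2, 2)]:
        for k in (1, 2, 3):
            p = k_staleness_probability(3, r, w, k)
            print(f"N=3 R={r} W={w} k={k}: P(within k versions) = {p:.3f}")
```

Under these assumptions, N=3 with R=W=1 misses the latest write with probability 2/3 per read, while any configuration with R+W > N never misses it. In practice, anti-entropy and write propagation make real deployments considerably more consistent than this bound suggests.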
Currently, in Cassandra, there's no good way to predict the performance benefits of using partial quorums or the consistency they provide. However, as you're probably well aware, Cassandra uses partial quorums (N=3, R=W=1) by *default*, so this work is particularly relevant to many deployments. By measuring the latency of messaging and using modeling techniques we've developed, Cassandra can do better by describing the probability of consistency according to both time and versions. You can try an interactive demo in your browser at http://cs.berkeley.edu/~pbailis/projects/pbs/#demo, and DataStax's Paul Cannon posted a good write-up on their blog last week: http://www.datastax.com/dev/blog/your-ideal-performance-consistency-tradeoff

Moreover, these techniques are broadly applicable: for example, in our Technical Report (http://cs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-4.pdf), we analyze Cassandra as well as production deployments of Voldemort and Riak at LinkedIn and Yammer.

We've developed a patch for Cassandra that performs this profiling and analysis, and we are potentially interested in working to integrate it as a feature in Cassandra (see code and documentation at: https://github.com/pbailis/cassandra-pbs).

We welcome any feedback or questions you might have.

Thanks!

Peter Bailis
UC Berkeley

More info:

You can read an overview of PBS on our project page:
http://cs.berkeley.edu/~pbailis/projects/pbs/

You can also read our technical report on PBS, which has more technical detail:
http://cs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-4.pdf

Daniel Abadi recently blogged about the latency-consistency trade-off:
http://dbmsmusings.blogspot.com/2011/12/replication-and-latency-consistency.html

Henry Robinson (Cloudera) also blogged about PBS:
http://the-paper-trail.org/blog/?p=334