Seems accurate to me. One small correction: the daemon in HBase that serves regions is known as a "region server" rather than a "region master". The RS is the equivalent of the tablet server in Bigtable terminology.
-Todd

On Mon, Nov 22, 2010 at 4:50 PM, David Jeske <dav...@gmail.com> wrote:
> This is my second attempt at a summary of Cassandra vs. HBase consistency and performance for an HBase-acceptable workload. I think these tricky subtleties are hard to understand, yet it's helpful for the community to understand them. I'm not trying to state my own facts (or opinion) but merely to summarize what I've read.
>
> Again, please correct any facts which are wrong. Thanks for the kind and thoughtful responses!
>
> *1) Cassandra can't replicate the consistency situation of HBase.* Namely, that once a write is finished, the new value will either always appear or never appear.
>
> [In Cassandra] Provided at least one node receives the write, it will eventually be written to all replicas. A failure to meet the requested ConsistencyLevel is just that; not a failure to write the data itself. Once the write is received by a node, it will eventually reach all replicas; there is no rollback. - Nick Telford [ref <http://www.mail-archive.com/user@cassandra.apache.org/msg07398.html>]
>
> In Cassandra (N3/W3/R1, N3/W2/R2, or N3/W3/R3), a write can reach a single node, fail to meet the requested write consistency, and a read-back can show the old value but later show the new value, once the write that did occur is propagated.
>
> [In HBase] Once a region master accepts a write, it has been flushed to the HDFS log. If the serving node goes down while writing, then if the write finished to any copy of the HDFS log, the new region master will accept and propagate the write; if not, the write will never appear.
>
> *2) Cassandra has a less efficient use of memory, particularly for data pinned in memory.* With 3 replicas on Cassandra, each element of data pinned in memory is kept on 3 servers, whereas in HBase only region masters keep the data in memory, so there is only one copy of each data element.
>
> CASSANDRA-1314 <https://issues.apache.org/jira/browse/CASSANDRA-1314> provides an opportunity to allow a "soft master", where reads prefer a particular replica. Combined with disabling read repair, this should allow more efficient memory usage for data pinned or cached in memory. #1 is still true, namely that a write may land only on a node which is not the soft master, so the new value may not appear for a while and then eventually appear. However, with N3/W3/R1, once a write appears at the soft master it will remain, so as long as the soft-master preference can be honored, this comes closer to HBase's consistency.
>
> *3) HBase can't match the row-availability situation of Cassandra (N3/W2/R2).* In the face of a single machine failure, if it is a region master, those keys are offline in HBase until a new region master is elected and brought online. In Cassandra, no single node failure causes the data to become unavailable.
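To make the single-writer model in points #1 and #3 concrete, here is a minimal sketch using the HBase Java client. Note that the Connection/Table API shown here postdates this thread, and the "users" table and "f:name" column are made up; the point is just that one region server owns each row, and a put is durable in the HDFS write-ahead log before it is acknowledged:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseSingleWriterSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("users"))) {

      // The put is routed to the single region server hosting this row's
      // region, and is appended to the HDFS write-ahead log before the
      // call returns; an acknowledged write therefore never disappears.
      Put put = new Put(Bytes.toBytes("row42"));
      put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("name"),
          Bytes.toBytes("alice"));
      table.put(put);

      // The get is served by the same region server, so it always sees the
      // acknowledged write. The flip side (point #3): if that server dies,
      // this row is unavailable until the region is reassigned elsewhere.
      Result r = table.get(new Get(Bytes.toBytes("row42")));
      System.out.println(Bytes.toString(
          r.getValue(Bytes.toBytes("f"), Bytes.toBytes("name"))));
    }
  }
}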
> *4) Two Cassandra configurations are closest to the consistency situation of HBase, and they provide slightly different node-failure characteristics.* (Note: #1 above means Cassandra can't truly reach the same consistency situation as HBase.)
>
> In Cassandra (N3/W3/R1), a node failure will disallow writes to a key range during the replica rebuild, while still allowing reads.
>
> In Cassandra (N3/W2-3/R2), a node failure will allow both reads and writes to continue, while requiring uncached reads to contact two servers. (Requiring a response from two servers may increase common-case latency, but may hide latency from GC spikes, since any two of the three may respond.)
>
> In HBase, if an HDFS node fails, both reads and writes continue; when a region master fails, both reads and writes stall until the region master is replaced.
>
> Was that a better summary? Is it closer to correct?
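For readers mapping the N/W/R shorthand in point #4 onto client code, here is a minimal sketch of the N3/W2/R2 configuration, using the DataStax Java driver (which postdates this thread; in 2010 the Thrift API filled this role). The keyspace, table, and contact point are made up. With a replication factor of 3, QUORUM means two replicas, so W + R = 2 + 2 > N = 3 and every read quorum overlaps every write quorum:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class QuorumSketch {
  public static void main(String[] args) {
    try (Cluster cluster = Cluster.builder()
            .addContactPoint("127.0.0.1").build();
         Session session = cluster.connect()) {

      // W=2: the coordinator acknowledges once two of the three replicas
      // have the write. A timeout here does NOT mean the write failed;
      // a replica that received it will eventually propagate it (point #1).
      SimpleStatement write = new SimpleStatement(
          "INSERT INTO ks.users (id, name) VALUES (42, 'alice')");
      write.setConsistencyLevel(ConsistencyLevel.QUORUM);
      session.execute(write);

      // R=2: any two of the three replicas may answer, so the read
      // survives one node failure and is guaranteed to overlap the
      // write quorum above, returning the latest acknowledged value.
      SimpleStatement read = new SimpleStatement(
          "SELECT name FROM ks.users WHERE id = 42");
      read.setConsistencyLevel(ConsistencyLevel.QUORUM);
      ResultSet rs = session.execute(read);
      System.out.println(rs.one().getString("name"));
    }
  }
}

The N3/W3/R1 variant maps to ConsistencyLevel.ALL on writes and ConsistencyLevel.ONE on reads; per point #1, a write that times out at either level may still surface later, because any replica that received it will eventually hand it to the others.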