Interesting idea with the counter row approach. I think it puts a dubious responsibility on the Cassandra user. Sure, Cassandra users are expected to manage the size of their rows, but asking them to constantly aggregate uuids in a rapidly growing row just to maintain a counter seems beyond what the average Cassandra end user should have to do.
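
To make that burden concrete, here is a rough sketch of what the counter row approach asks of the user. It is plain in-memory Python, not Cassandra client code. The `CounterRow` class, its methods, and `unaggregated_bytes` are hypothetical names of mine, and the 16 bytes per uuid counts only the raw uuid, ignoring column values and storage overhead.

```python
import uuid

UUID_BYTES = 16  # a raw uuid is 128 bits; column values and overhead are ignored


class CounterRow:
    """In-memory stand-in for one counter row (hypothetical, not a Cassandra API)."""

    def __init__(self):
        self.pending = {}  # uuid -> delta, one column per increment
        self.total = 0     # running total written back by the aggregator

    def increment(self, delta=1):
        # Every increment is a blind write of a new uuid-named column.
        self.pending[uuid.uuid4()] = delta

    def aggregate(self):
        # The user's background job: fold a snapshot of the pending columns
        # into the total, then delete only the columns that were folded.
        batch = list(self.pending.items())
        folded = sum(delta for _, delta in batch)
        self.total += folded
        for key, _ in batch:
            del self.pending[key]
        return folded

    def read(self):
        # A read has to add whatever has not been aggregated yet.
        return self.total + sum(self.pending.values())

    def pending_bytes(self):
        # Lower bound on row growth caused by unaggregated increments.
        return len(self.pending) * UUID_BYTES


def unaggregated_bytes(increments_per_sec, seconds):
    """Bytes of uuids that pile up if aggregate() never runs."""
    return increments_per_sec * seconds * UUID_BYTES
```

Everything in `aggregate()` is work the user has to schedule, monitor, and keep alive. If it ever stops, `pending_bytes()` grows without bound, and a real implementation would also have to cope with concurrent writers and deletes, which this sketch ignores.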

My napkin math may be slightly off, but if a "counter row aggregator" stopped functioning, crashed, or didn't do its job correctly on a counter receiving 2,000 increments per second, you would end up with a single row of more than 2.57 GiB (about 2.76 GB) after 24 hours (2,000/sec x 86,400 seconds x 16 bytes per uuid = 2,764,800,000 bytes). This approaches the amount of memory on a single node and would seem (to me?) to significantly impact load and load distribution.

Maybe there is a way Cassandra could perform the counter row aggregation internally (with read repair?) and offer it to end users as a clean, simple, intuitive interface. I have never thought counters were something Cassandra handles well.

If there is not a satisfactory way to integrate counters into the Cassandra internals, I think it'd be great for somebody in the know to provide in-depth and detailed documentation on best practices for implementing counters. I think distributed and scalable counters can be a killer app for Cassandra, and circumventing locking systems such as ZooKeeper is key.

Disclaimer: I'm not quite a Cassandra developer, more of an Ops guy and user, just trying to add perspective. I do not want a pony.

-Ben Standefer

On Thu, Aug 12, 2010 at 8:54 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
> There are two concerns that give me pause.
>
> The first is that 1072 is tackling a use case that Cassandra already
> handles well: high volume of writes to a counter, with low volume
> reads. (This can be done by inserting uuids into a counter row, and
> aggregating them either in the background or at read time or with some
> combination of these. The counter rows can be sharded if necessary.)
>
> The second is that the approach in 1072 resembles an entirely separate
> system that happens to use part of Cassandra infrastructure -- the
> thrift API, the MessagingService, the sstable format -- but isn't
> really part of it. ConsistencyLevel is not respected, and special
> cases abound to weld things in that don't fit, e.g. the AES/Streaming
> business.
>
> On Thu, Aug 12, 2010 at 1:28 AM, Robin Bowes <robin-li...@robinbowes.com> wrote:
> > Hi Jonathan,
> >
> > I'm contacting you in your capacity as project lead for the cassandra
> > project. I am wondering how close ticket #1072 is to implementation [1]
> >
> > We are about to do a proof of concept with cassandra to replace around
> > 20 MySQL partitions (1 partition = 4 machines: master/slave in DC A,
> > master/slave in DC B).
> >
> > We're essentially just counting web hits - around 10k/second at peak
> > times - so increment counters is pretty much essential functionality for us.
> >
> > How close is the patch in #1072 to being acceptable? What is blocking it?
> >
> > Thanks,
> >
> > R.
> >
> > [1] https://issues.apache.org/jira/browse/CASSANDRA-1072
> >
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com