Interesting idea with the counter row approach.  I think it puts a dubious
responsibility on the Cassandra user.  Sure, Cassandra users are expected to
manage the size of their rows, but asking them to constantly aggregate
counts of uuids in rows that are growing rapidly, just to maintain a
counter, seems beyond what the average Cassandra end user can reasonably be
expected to do.
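
For anyone skimming, here is a rough sketch of the pattern in question
(plain Python, with a dict standing in for the counter row; a real setup
would go through Thrift or a client such as pycassa, so treat the names
here as illustrative only, not actual Cassandra API):

    import uuid

    # Stand-in for one "counter row": column name -> column value.
    counter_row = {}

    def record_hit():
        """Write path: insert a 16-byte uuid column per event (cheap, no locks)."""
        counter_row[uuid.uuid1().bytes] = b''

    def aggregate():
        """The job the user is expected to keep running: fold the uuid
        columns into a running total column, then delete them."""
        new_hits = [col for col in counter_row if col != b'total']
        total = int(counter_row.get(b'total', b'0')) + len(new_hits)
        counter_row[b'total'] = str(total).encode()
        for col in new_hits:
            del counter_row[col]
        return total

    for _ in range(5):
        record_hit()
    print(aggregate())  # -> 5

The write path is trivial; it's keeping aggregate() running, correct, and
ahead of the write rate that worries me.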

My napkin math may be slightly off, but if a "counter row aggregator"
stopped functioning, crashed, or didn't do its job correctly on a counter
receiving 2,000 increments per second, you'd end up with a single row at
>2.57GB after 24 hours (2,000/sec x 86,400 seconds x 16 bytes per uuid).
This approaches the magnitude of memory on a single node and would seem
(to me?) to significantly impact load and load distribution.  Maybe there is
a way Cassandra could perform the counter row aggregation internally (with
read repair?) and offer it to end users as a clean, simple, intuitive
interface.
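
The arithmetic behind that number, in case anyone wants to poke at it
(assuming 16-byte time-uuid column names and ignoring per-column overhead,
which only makes it worse):

    increments_per_sec = 2000
    seconds_per_day = 86400
    uuid_bytes = 16
    row_bytes = increments_per_sec * seconds_per_day * uuid_bytes
    print(row_bytes / 2**30)  # ~2.57 GiB in uuids alone, per day, per counter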

I have never thought counters were something Cassandra handles well.  If
there is not a satisfactory way to integrate counters into the Cassandra
internals, I think it'd be great for somebody in the know to provide
in-depth, detailed documentation on best practices for implementing
counters.  I think distributed and scalable counters can be a killer app for
Cassandra, and avoiding external locking systems such as ZooKeeper is key.

Disclaimer: I'm not quite a Cassandra developer, more of an Ops guy and
user, just trying to add perspective.  I do not want a pony.

-Ben Standefer


On Thu, Aug 12, 2010 at 8:54 PM, Jonathan Ellis <jbel...@gmail.com> wrote:

> There are two concerns that give me pause.
>
> The first is that 1072 is tackling a use case that Cassandra already
> handles well: high volume of writes to a counter, with low volume
> reads.  (This can be done by inserting uuids into a counter row, and
> aggregating them either in the background or at read time or with some
> combination of these.  The counter rows can be sharded if necessary.)
>
> The second is that the approach in 1072 resembles an entirely separate
> system that happens to use parts of the Cassandra infrastructure -- the
> thrift API, the MessagingService, the sstable format -- but isn't
> really part of it.  ConsistencyLevel is not respected, and special
> cases abound to weld things in that don't fit, e.g. the AES/Streaming
> business.
>
> On Thu, Aug 12, 2010 at 1:28 AM, Robin Bowes <robin-li...@robinbowes.com>
> wrote:
> > Hi Jonathan,
> >
> > I'm contacting you in your capacity as project lead for the cassandra
> > project. I am wondering how close ticket #1072 is to implementation [1]
> >
> > We are about to do a proof of concept with cassandra to replace around
> > 20 MySQL partitions (1 partition = 4 machines: master/slave in DC A,
> > master/slave in DC B).
> >
> > We're essentially just counting web hits - around 10k/second at peak
> > times - so increment counters is pretty much essential functionality for
> us.
> >
> > How close is the patch in #1072 to being acceptable? What is blocking it?
> >
> > Thanks,
> >
> > R.
> >
> > [1] https://issues.apache.org/jira/browse/CASSANDRA-1072
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
