Some large systems have been built on Cassandra's counters (such as Rainbird:
http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011)
and seem to be doing a great job.

If you are concerned about performance, then a memory-based store
(such as Redis) may suit your case better, as long as the data fits in
memory. Considering the data model, I guess it might.

If you are going to stick with Cassandra, then tweaking the compaction
thresholds can make a visible difference to read performance, at least
from what I have seen. You can also consider changing the PRIMARY KEY to
((uid, someid), time). This makes the partition key out of uid+someid,
rather than just uid. Depending on the access pattern, it might help.
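
A rough sketch of both suggestions, assuming your xcounter2 table with the
key column named uid throughout. The threshold values are purely
illustrative, not tuned recommendations, and I am assuming the default
size-tiered compaction strategy:

```cql
-- Composite partition key: each (uid, someid) pair gets its own partition,
-- with the events clustered by time inside it.
CREATE TABLE xcounter2 (
    uid uuid,
    someid int,
    time timeuuid,
    PRIMARY KEY ((uid, someid), time)
);

-- Lower the minimum number of SSTables that triggers a compaction, so that
-- reads have fewer SSTables to merge (illustrative values only).
ALTER TABLE xcounter2
    WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'min_threshold': 2,
        'max_threshold': 32
    };
```

With this key, counting the events for one (uid, someid) pair reads a single
partition, but you can no longer fetch all someid values for a uid in one
query, so it only pays off if you always read by both.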


On Thu, Dec 5, 2013 at 4:44 PM, Christopher Wirt <chris.w...@struq.com> wrote:

> I want to build a really simple column family which counts the occurrence
> of a single event X.
>
>
>
> Once we reach Y occurrences of X the counter resets to 0
>
>
>
> The obvious way to do this is with a counter CF.
>
>
>
> CREATE TABLE xcounter1 (
>                 uid uuid,
>                 someid int,
>                 count counter,
>                 PRIMARY KEY (uid, someid)
> );
>
>
>
> This is how I’ve always done it in the past, but I’ve been told to avoid
> counters for various reasons: performance, consistency, etc.
>
> I’m not too bothered about 100% absolute consistency, however read
> performance is certainly a big concern.
>
>
>
> So I was thinking to avoid using counters I could do something like this.
>
>
>
> CREATE TABLE xcounter2 (
>                 uid uuid,
>                 someid int,
>                 time timeuuid,
>                 PRIMARY KEY (uid, someid, time)
> );
>
>
>
> Then retrieve all events and count in memory. Delete all id, someid
> records once I hit Y.
>
>
>
> Or I could
>
> CREATE TABLE xcounter3 (
>                 uid uuid,
>                 someid int,
>                 time timeuuid,
>                 Ycount int,
>                 PRIMARY KEY (uid, someid, time)
> );
>
>
>
> Insert a ‘Ycount’ on each occurrence of the event.
>
> Only retrieve the last Y value inserted on reading
>
> Then delete all records once I hit the magic Y value.
>
>
>
>
>
> Anyone have any interesting thoughts or insight on what is likely to give
> me the best read performance?
>
> There will be 100’s of someid to each id. Reads will be 5-10x the writes.
>
>
>
>
>
> Thanks,
>
>
>
> Chris
>
