On Sun, Jul 24, 2011 at 3:36 PM, aaron morton <aa...@thelastpickle.com> wrote:
> What's your use case ? There are people out there having good times with
> counters, see
>
> http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011
> http://www.scribd.com/doc/59830692/Cassandra-at-Twitter

It's actually pretty similar to Twitter's click counting, but apparently we
have different requirements for accuracy. It's possible Rainbird does
something on the front end to solve this issue; I'm honestly not sure,
since they haven't released the code yet.

Anyway, when you're building network aggregate graphs and fail to add the
+100G of traffic from one switch to your site or metro aggregate, people
around here notice. And people quickly start distrusting graphs which don't
look "real" and either ignore them completely or complain.

Obviously, one should manage their Cassandra cluster to limit the occurrence
of timeouts, but frankly I don't want to be paged at 2am to "fix" these
kinds of problems. If I knew "timeout" meant "failed to increment counter",
I could spool my changes on the client and try again later, but that's not
what timeout means: the increment may or may not have been applied, so a
retry risks counting it twice. Without any means to recover, I've actually
lost a lot of the reliability that I currently have with my data store
backed by a single PostgreSQL server.

Right now I'm trying to come up with a way that my distributed SNMP pollers
can build aggregates efficiently without counters, but that's going to add
a lot of complexity. :(

-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"
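
For what it's worth, the timeout problem can be sketched with a toy
in-memory model. This is only an illustration under stated assumptions:
`ToyCounterStore`, `CounterTimeout`, `put_sample`, `aggregate` and `retry`
are hypothetical names made up for the example, not a Cassandra client API,
and the "sum per-sample writes on read" variant is just one possible shape
for a counter-free aggregate, not a worked-out design.

```python
class CounterTimeout(Exception):
    """Client gave up waiting. Says nothing about whether the write landed."""


class ToyCounterStore:
    """Hypothetical in-memory stand-in for a remote store (not Cassandra).

    `script` lists the outcome of each successive call:
      "ok"              -> write applied, client gets an answer
      "timeout_applied" -> write applied, client sees a timeout
      "timeout_lost"    -> write NOT applied, client sees a timeout
    Calls beyond the end of the script behave as "ok".
    """

    def __init__(self, script=()):
        self.script = list(script)
        self.counters = {}   # key -> running total
        self.samples = {}    # key -> {sample_id: value}

    def _finish(self, outcome, key):
        if outcome != "ok":
            raise CounterTimeout(key)

    def increment(self, key, delta):
        # Non-idempotent: applying the same delta twice changes the result.
        outcome = self.script.pop(0) if self.script else "ok"
        if outcome != "timeout_lost":
            self.counters[key] = self.counters.get(key, 0) + delta
        self._finish(outcome, key)

    def put_sample(self, key, sample_id, value):
        # Idempotent: rewriting the same sample_id overwrites itself.
        outcome = self.script.pop(0) if self.script else "ok"
        if outcome != "timeout_lost":
            self.samples.setdefault(key, {})[sample_id] = value
        self._finish(outcome, key)

    def aggregate(self, key):
        # Aggregate computed at read time from the individual samples.
        return sum(self.samples.get(key, {}).values())


def retry(op, attempts=3):
    """Call op() until it returns without a timeout, up to `attempts` times."""
    for _ in range(attempts):
        try:
            op()
            return True
        except CounterTimeout:
            continue
    return False


# Same failure in both runs: the first call is applied but times out.
script = ["timeout_applied", "ok"]

counter_store = ToyCounterStore(script)
retry(lambda: counter_store.increment("metro-agg", 100))

sample_store = ToyCounterStore(script)
retry(lambda: sample_store.put_sample("metro-agg", ("switch1", 1311546960), 100))

print(counter_store.counters["metro-agg"])   # retried increment
print(sample_store.aggregate("metro-agg"))   # retried idempotent write
```

In this model the retried increment is counted twice, while not retrying
would lose the value whenever the outcome is "timeout_lost", so neither
client policy is safe for the counter. The keyed write can be retried
blindly, at the cost of summing on read.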