Hi,

I am not sure whether this is a bug or whether we are using counters the wrong way, but I keep getting an enormous counter value in our deployment. After a few tries, I was finally able to reproduce it.

My development setup is a two-node cluster with the following keyspace and column family settings:

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        63fda700-c243-11e0-0000-2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [datacenter1:2]
  Column Families:
    ColumnFamily: testCounter (Super)
    "APP status information."
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.CounterColumnType
      Columns sorted by: org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 200000.0/14400
      Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []

Then I use a test program based on Hector to increment a counter column (testCounter[sc][column]) 1000 times. In the middle of this process, I intentionally shut down node 172.17.19.152. The test program is smart enough to switch the consistency level from Quorum to One, so that the remaining increments do not fail.

After all the increments are done, I start Cassandra on 172.17.19.152 again and use cassandra-cli to check whether the counter is correct on both nodes. I get 1001, which seems reasonable because Hector will retry once.

However, I then shut down 172.17.19.151, wait until 172.17.19.152 is aware that 172.17.19.151 is down, and start Cassandra on 172.17.19.151 again. When I check the counter this time, I get 481387, which is clearly wrong.

Could anyone explain why this happens? Is this a bug, or am I using the counter the wrong way?

Regards,
Boris