Hello,

I've found a combination that doesn't work: a column family that has a secondary index and caching='ALL', with data in two datacenters. After I restart the nodes, my secondary index queries start returning 0 rows. It only happens once the amount of data goes over a certain threshold, so I suspect that compactions are involved as well. Taking out one of the ingredients fixes the problem, and my queries return rows from the secondary index. I suspect that the reporter of this ticket is struggling with the same thing: https://issues.apache.org/jira/browse/CASSANDRA-4785

Here is a sequence of actions that reproduces it with the help of CCM:

$ ccm create --cassandra-version 1.2.1 --nodes 2 -p RandomPartitioner testRowCacheDC
$ ccm updateconf 'endpoint_snitch: PropertyFileSnitch'
$ ccm updateconf 'row_cache_size_in_mb: 200'
$ cp ~/Downloads/cassandra-topology.properties ~/.ccm/testRowCacheDC/node1/conf/   (please find the .properties file below)
$ cp ~/Downloads/cassandra-topology.properties ~/.ccm/testRowCacheDC/node2/conf/
$ ccm start
$ ccm cli
  -> create keyspace and column family (please find the schema below)
$ python populate_rowcache.py
$ ccm stop   (I tried a flush first, it doesn't help)
$ ccm start
$ ccm cli
Connected to: "testRowCacheDC" on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.2.1-SNAPSHOT

Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.

[default@unknown] use testks;
Authenticated to keyspace: testks
[default@testks] get cf1 where 'indexedColumn'='userId_75';

0 Row Returned.
Elapsed time: 68 msec(s).

My Cassandra instances run with -Xms1927M -Xmx1927M -Xmn400M

Thanks for any help.

Best regards,
Alexei

------ START cassandra-topology.properties ----------
127.0.0.1=DC1:RAC1
127.0.0.2=DC2:RAC1
default=DC1:r1
------ FINISH cassandra-topology.properties ----------

------ START cassandra-cli schema -----------
create keyspace testks
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC2 : 1, DC1 : 1}
  and durable_writes = true;

use testks;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'org.apache.cassandra.db.marshal.AsciiType'
  and default_validation_class = 'UTF8Type'
  and key_validation_class = 'UTF8Type'
  and read_repair_chance = 1.0
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'ALL'
  and column_metadata = [
    {column_name : 'indexedColumn',
    validation_class : UTF8Type,
    index_name : 'INDEX1',
    index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
-------FINISH cassandra-cli schema -----------

------ START populate_rowcache.py -----------
from pycassa.batch import Mutator
import pycassa

pool = pycassa.ConnectionPool('testks', timeout=5)
cf = pycassa.ColumnFamily(pool, 'cf1')

for userId in xrange(0, 1000):
    print userId
    b = Mutator(pool, queue_size=200)
    for itemId in xrange(20):
        rowKey = 'userId_%s:itemId_%s'%(userId, itemId)
        for message_number in xrange(10):
            b.insert(cf, rowKey, {'indexedColumn': 'userId_%s'%userId, str(message_number): str(message_number)})
    b.send()

pool.dispose()
------ FINISH populate_rowcache.py -----------
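
For reference, here is a rough sketch of what the index query should return if everything worked. populate_rowcache.py writes 20 rows per userId, each with 'indexedColumn' set to 'userId_<N>', so the query for 'userId_75' should come back with 20 rows rather than 0. The helper names expected_row_keys and count_indexed_rows are hypothetical and not part of the scripts above. The second helper assumes pycassa's index helpers and a running cluster, so it is only defined here, not called.

```python
# Hypothetical helpers, not part of the original reproduction scripts.

def expected_row_keys(user_id, items_per_user=20):
    """Rebuild the row keys populate_rowcache.py writes for one userId.

    Every one of these rows carries indexedColumn = 'userId_<user_id>',
    so the secondary index query should return exactly these keys.
    """
    return ['userId_%s:itemId_%s' % (user_id, item_id)
            for item_id in range(items_per_user)]


def count_indexed_rows(cf, user_id):
    """Count rows returned by the secondary index for one userId.

    Assumes `cf` is a pycassa.ColumnFamily for cf1 and that pycassa is
    installed; the import is done lazily so the rest runs without it.
    """
    from pycassa.index import create_index_clause, create_index_expression
    expr = create_index_expression('indexedColumn', 'userId_%s' % user_id)
    clause = create_index_clause([expr], count=1000)
    return sum(1 for _ in cf.get_indexed_slices(clause))


keys = expected_row_keys(75)
print(len(keys))   # 20
print(keys[0])     # userId_75:itemId_0
```

With a live cluster, comparing count_indexed_rows(cf, 75) against len(expected_row_keys(75)) before and after the restart would show whether the index results went missing.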