I think I am seeing the same issue, but it doesn't seem to be related to schema_columns. I understand that repair is supposed to be intensive, but this is bringing the associated machine to its knees, to the point that logging in to the machine takes a very, very long time and requests are no longer served (load avg ~2000.0). Is this normal? Is this a symptom of the machine not compacting enough during normal operation (minor compactions)? Thoughts?
Cassandra 1.1.5, 12-node application cluster connected to a smaller analytics cluster. The analytics cluster was repairing, but it seemed to swamp one of the nodes on the 12-node cluster.

Environment:
- Java 1.6_u5, CentOS
- 40 - 80 GB load on each node (too much?)
- Main CF has 4 indexes
- Standard configs, no multithreaded compaction, 16 MB/s compaction throughput

Speaking of the throughput, reading the cassandra.yaml file made me think that the throughput is not set correctly, but I'm not sure how to calculate the ideal value. Should I only consider the actual data size inserted, or should I use a single-node load figure / uptime_seconds as a guess (assuming constant load)?

Thanks,
Bryan

Re: http://www.mail-archive.com/user@cassandra.apache.org/msg25561.html
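For what it's worth, here is a rough sketch of the "load / uptime" guess described above, as a back-of-the-envelope sanity check for compaction_throughput_mb_per_sec. The 16x multiplier follows the cassandra.yaml comment suggesting throughput around 16x the insert rate; the node figures below are illustrative, not measured values.

```python
# Back-of-the-envelope check: estimate sustained per-node write rate from
# the node's data load and uptime, assuming roughly constant write load.
# All numbers here are hypothetical placeholders.

def writes_mb_per_sec(node_load_bytes, uptime_seconds):
    """Rough per-node ingest rate in MB/s, assuming load accumulated
    at a constant rate since the node started."""
    return node_load_bytes / uptime_seconds / (1024 * 1024)

# Example: ~60 GB of load accumulated over ~30 days of uptime.
load_bytes = 60 * 1024 ** 3
uptime_seconds = 30 * 24 * 3600

rate = writes_mb_per_sec(load_bytes, uptime_seconds)

# cassandra.yaml's comment suggests setting compaction throughput to
# roughly 16x the sustained insert rate (an assumption carried over
# from the config file, not a hard rule).
suggested_throughput = 16 * rate

print(f"estimated write rate: {rate:.3f} MB/s")
print(f"suggested compaction throughput (16x): {suggested_throughput:.2f} MB/s")
```

If the suggested value comes out far below the configured 16 MB/s, throughput probably isn't the bottleneck; if it comes out above, the node may indeed be falling behind on minor compactions between repairs.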