We have deployed a multi-data-center cluster but have run into a performance issue. When the nodes in the other data center are up, the read response time seen by clients is 4 to 5 times higher. When we take those nodes down, the response time returns to normal (compared with the time before we changed to multi-center).
We have high volume on the cluster, and the consistency level for reads is ONE, so my understanding is that most of the traffic between the data centers should be read repair. But that does not seem like it should add this much delay. What could cause the problem, and how can we debug it?

Here is the keyspace:

[default@dsat] describe dsat;
Keyspace: dsat:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [dc2:1, dc1:3]
  Column Families:
    ColumnFamily: categorization_cache

Ring:

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  xx.xx.xx..111  59.2 GB   256     37.5%             4d6ed8d6-870d-4963-8844-08268607757e  rac1
DN  xx.xx.xx..121  99.63 GB  256     37.5%             9d0d56ce-baf6-4440-a233-ad6f1d564602  rac1
UN  xx.xx.xx..120  66.32 GB  256     37.5%             0fd912fb-3187-462b-8c8a-7d223751b649  rac1
UN  xx.xx.xx..118  63.61 GB  256     37.5%             3c6e6862-ab14-4a8c-9593-49631645349d  rac1
UN  xx.xx.xx..117  68.16 GB  256     37.5%             ee6cdf23-d5e4-4998-a2db-f6c0ce41035a  rac1
UN  xx.xx.xx..116  32.41 GB  256     37.5%             f783eeef-1c51-4f91-ab7c-a60669816770  rac1
UN  xx.xx.xx..115  64.24 GB  256     37.5%             e75105fb-b330-4f40-aa4f-8e6e11838e37  rac1
UN  xx.xx.xx..112  61.32 GB  256     37.5%             2547ee54-88dd-4994-a1ad-d9ba367ed11f  rac1

Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
DN  xx.xx.xx.199   58.39 GB  256     50.0%             6954754a-e9df-4b3c-aca7-146b938515d8  rac1
DN  xx.xx.xx..61   33.79 GB  256     50.0%             91b8d510-966a-4f2d-a666-d7edbe986a1c  rac1

Thank you in advance,
Daning
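
P.S. As a sanity check on the ring output, the "Owns (effective)" figures are what the replication options [dc2:1, dc1:3] would predict if tokens are evenly balanced across the nodes of each data center (that even balance is an assumption). A minimal sketch of the arithmetic; the function name is only for illustration and is not part of any Cassandra tool:

```python
def effective_ownership(replication_factor: int, node_count: int) -> float:
    """Percent of the data each node holds in one data center,
    assuming tokens are spread evenly over that data center's nodes."""
    return 100.0 * replication_factor / node_count


# dc1: replication factor 3 over 8 nodes; dc2: replication factor 1 over 2 nodes
dc1_owns = effective_ownership(3, 8)
dc2_owns = effective_ownership(1, 2)

print(f"dc1: {dc1_owns}%  dc2: {dc2_owns}%")
```

So each dc2 node is expected to hold half of the full data set, while each dc1 node holds three eighths of it.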