On Mon, Mar 7, 2011 at 11:32 AM, John Lewis <lewili...@gmail.com> wrote:
> When you say decent latency and throughput what numbers do you consider
> decent? I know throughput would be highly dependent on the quantity of kb
> shoved through the pipe so I would expect throughput needs would be highly
> dependent on the data actually in cassandra.

As you say, the throughput needed depends on Cassandra payload size, but also (in 0.7) on the read repair percentage. Cassandra consumes a lot of network traffic relative to the amount of data served to clients, because of background repair processes like read repair and manual AES (anti-entropy) repair. There are obviously scenarios where you might saturate the WAN link, given large enough nodes or enough nodes per datacenter.

When I talk about latency, my experience is with WAN latency under 100ms and without DynamicEndpointSnitch (DES). I suspect that latency within an order of magnitude of that, with or without DES, is likely to be fine for many use cases. There are a few tunables which might be appropriate to increase when operating in more than two datacenters with greater possible latency between any two, as well as replication strategies and consistency levels which offer certain latency behavior.

As always, simulating your actual workload is likely to give you the most relevant information about the impact of inter-Cassandra latency on your application. :)

=Rob
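
To illustrate how consistency level interacts with inter-datacenter latency, here is a rough Python sketch. It is a toy model, not Cassandra code: it assumes the coordinator simply waits for the required number of replica responses, so request latency is roughly the k-th fastest replica round trip. The function names and the round-trip times are hypothetical, and the model ignores processing time, queueing, read repair, and snitch behavior.

```python
def replicas_required(consistency_level, replication_factor):
    """Number of replica responses the coordinator waits for (toy model)."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError("unsupported consistency level: %s" % consistency_level)


def estimated_latency_ms(replica_rtts_ms, consistency_level):
    """Latency is approximated by the k-th smallest replica round trip,
    where k is the number of responses the consistency level requires."""
    needed = replicas_required(consistency_level, len(replica_rtts_ms))
    return sorted(replica_rtts_ms)[needed - 1]


# Hypothetical round trips from one coordinator: two local replicas,
# two in a second datacenter, two in a third. Not measured values.
rtts_ms = [1.0, 1.0, 80.0, 80.0, 150.0, 150.0]

for level in ("ONE", "QUORUM", "ALL"):
    print(level, estimated_latency_ms(rtts_ms, level))
```

In this made-up layout, ONE is served at local latency, QUORUM has to wait on the second datacenter, and ALL is bound by the farthest one. Real numbers will differ, which is why testing with your actual workload matters.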