Thanks for the reply. I realize my question was rather nebulous, because the proposed deployment itself is still rather nebulous. Any information, and any pointers to the relevant sections of the documentation, will help make this challenge less nebulous over time. I will do some reading on the topics you have provided.

Thanks again,
Lewis

On Mar 7, 2011, at 11:52 PM, Robert Coli wrote:

> On Mon, Mar 7, 2011 at 11:32 AM, John Lewis <lewili...@gmail.com> wrote:
>> When you say decent latency and throughput what numbers do you consider
>> decent? I know throughput would be highly dependent on the quantity of kb
>> shoved through the pipe so I would expect throughput needs would be highly
>> dependent on the data actually in cassandra.
>
> As you say, throughput needed is dependent on Cassandra payload size,
> but also (in 0.7) read repair percentage. Cassandra is a large
> consumer of network traffic relative to the amount of data serviced to
> clients due to background repair processes like read repair and manual
> AES repair. There are obviously scenarios where you might saturate the
> WAN link given large enough nodes or numbers of nodes per datacenter..
>
> When I am talking about latency, my experience is with WAN latency
> under 100ms and without DynamicEndpointSnitch. I suspect that within
> an order of magnitude of that latency, with or without DES, is likely
> to be fine for many use cases. There are a few tunables which might
> be appropriate to increase when operating in more than two datacenters
> with greater possible latency between any two as well as replication
> strategies and consistency levels which offer certain latency
> behavior. As always, simulating your actual workload is likely to give
> you the most relevant information as to the impact of inter-cassandra
> latency on your application. :)
>
> =Rob