A J <s5alye <at> gmail.com> writes:

> Makes sense! Thanks.
> Just a quick follow-up:
> Now I understand the write is not made to coordinator (unless it is part of the replica for that key). But does the write column traffic 'flow' through the coordinator node. For a 2G column write, will I see 2G network traffic on the coordinator node or just a few bytes of traffic on the co-ordinator of it reading the key and talking to nodes/client etc?
Yes. If you talk to a random node first (the coordinator), all 2G of traffic flows to it and is then forwarded to the natural nodes (those owning replicas of the row being written).

If you want to avoid the extra traffic, determine the natural nodes of the row and send your write directly to one of them, so that a natural node becomes the coordinator. That node accepts the write locally and submits it to the other replicas in parallel.

If your client is written in Java, this is relatively easy to implement. Look at TokenMetadata.ringIterator().

If you are not required to use Cassandra's Thrift interface, it could be more efficient to write through the StorageProxy interface. It plays the role of a local coordinator and talks directly to all replicas, so the 2G is passed straight from your client to every replica of the row.

> This will be a factor for us. So need to make sure exactly.
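
To make "determine the natural nodes of the row" a bit more concrete, here is a rough, self-contained sketch of the idea: hash the key to a token, find the first node at or after that token on the ring, then keep walking clockwise until you have enough replicas. RingSketch, addNode, tokenFor and naturalNodes are hypothetical names made up for this illustration. This is not Cassandra's TokenMetadata, and it assumes an MD5-style partitioner and a simple "next N nodes on the ring" placement, which may not match your replication strategy.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical stand-in for a token ring; NOT Cassandra's TokenMetadata. */
class RingSketch {
    // token -> node address, kept sorted so the ring can be walked clockwise
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    void addNode(BigInteger token, String address) {
        ring.put(token, address);
    }

    /** Assumption: an MD5-based, non-negative token for the row key. */
    static BigInteger tokenFor(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Assumption: replicas are the first node whose token is >= the key
     * token, followed by the next distinct nodes clockwise on the ring.
     */
    List<String> naturalNodes(BigInteger keyToken, int replicationFactor) {
        // Tokens at or after the key first, then wrap around to the start.
        List<String> clockwise = new ArrayList<>(ring.tailMap(keyToken, true).values());
        clockwise.addAll(ring.headMap(keyToken, false).values());

        List<String> replicas = new ArrayList<>();
        for (String node : clockwise) {
            if (replicas.size() == replicationFactor) {
                break;
            }
            if (!replicas.contains(node)) {
                replicas.add(node);
            }
        }
        return replicas;
    }
}
```

A client would call something like naturalNodes(RingSketch.tokenFor(rowKey), rf) and open its connection to any address in the returned list, so the large column crosses the network once on its way into the cluster rather than passing through an unrelated node first.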