Hello Marcus, and thank you for your fast reply. Yes, we thought about that, and indeed it would work. However, we have strict write and read constraints for the producer and consumer datacenters respectively, so we would like to keep all or most access "local".
We don't need synchronization between datacenters to be fast, we just need to know when it's done :-/

Fabrice

From: Marcus Olsson [mailto:marcus.ols...@ericsson.com]
Sent: Tuesday, June 2, 2015 13:29
To: user@cassandra.apache.org
Subject: Re: Cassandra datacenters replication advanced usage

Hi Fabrice,

Have you considered using "each_quorum" instead of "all"? Each_quorum requires replies from a quorum of nodes in every datacenter.

This could be used in either of two ways:

- Producer using each_quorum and consumer using local_quorum (better read latencies at the cost of write latencies), or
- Producer using local_quorum and consumer using each_quorum (better write latencies at the cost of read latencies).

BR
Marcus Olsson

On 06/02/2015 01:00 PM, Fabrice Douchant wrote:

Hi everyone.

For a project, we use a Cassandra cluster in order to have fast reads/writes on a large volume of generated (column-oriented) data. Until now, we only had 1 datacenter for prototyping. We now plan to split our cluster into 2 datacenters to meet performance requirements (the data transfer between the two datacenters is quite slow):

- datacenter #1: located near our data producer services, which periodically and intensively write all data to Cassandra (each write has a "run_id" column in its primary key)
- datacenter #2: located near our data consumer services, which intensively read all data produced by datacenter #1 for a given "run_id".

However, we would like our consumer services to access data only in the datacenter near them (datacenter #2), and only once all data for a given "run_id" (generated by the producer services) have been completely replicated from datacenter #1.

My question is: how can we ensure that all data have been replicated to datacenter #2 before telling the consumer services (near datacenter #2) to start using them?

Our best solutions so far (but still not good enough :-P):

- Producer services (datacenter #1) write with consistency "all". But this leads to poor tolerance of network partitions AND really bad write performance.
- Producer services (datacenter #1) write with consistency "local_quorum", and a last "run finished" value could be written with consistency "all". But it seems Cassandra does not guarantee replication ordering.

Do you have any suggestions?

Thanks a lot,
Fabrice
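
As a rough illustration of the consistency levels discussed in this thread, the sketch below computes how many replica acknowledgements each level waits for in each datacenter, and whether a read is then guaranteed to overlap an earlier write. The replication factor of 3 per datacenter, the names dc1 and dc2, and the helper functions are assumptions made for the example, not details from the thread, and the code does not talk to a real cluster.

```python
# Sketch only: models the acknowledgement counts behind "all", "each_quorum"
# and "local_quorum". Datacenter names and replication factors are assumed.

def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def required_acks(level: str, rf_by_dc: dict, local_dc: str) -> dict:
    """Return, per datacenter, how many replicas must acknowledge a request."""
    level = level.upper()
    if level == "ALL":
        # Every replica in every datacenter must answer.
        return dict(rf_by_dc)
    if level == "EACH_QUORUM":
        # A quorum of replicas in each datacenter must answer.
        return {dc: quorum(rf) for dc, rf in rf_by_dc.items()}
    if level == "LOCAL_QUORUM":
        # Only a quorum in the coordinator's own datacenter must answer.
        return {dc: quorum(rf) if dc == local_dc else 0
                for dc, rf in rf_by_dc.items()}
    raise ValueError(f"unsupported consistency level: {level}")


def read_overlaps_write(write_acks: int, read_acks: int, rf: int) -> bool:
    """True when any read replica set must intersect the write replica set."""
    return write_acks + read_acks > rf


# Assumed topology: 3 replicas in each of the two datacenters.
RF = {"dc1": 3, "dc2": 3}

# Producer writes from dc1, consumer reads from dc2.
each_quorum_write = required_acks("each_quorum", RF, local_dc="dc1")
local_quorum_write = required_acks("local_quorum", RF, local_dc="dc1")
local_quorum_read = required_acks("local_quorum", RF, local_dc="dc2")
```

Under these assumed settings, an each_quorum write waits for 2 replicas in both dc1 and dc2, so a local_quorum read in dc2 (2 of 3 replicas) must overlap it. A local_quorum write issued in dc1 waits for no replica in dc2, which is why it gives no indication of when replication to dc2 has finished.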