Oops, I was assuming everything lived in the same keyspace. If you create a separate keyspace for each DC, you can specify where each keyspace's replicas are placed, so its data lives in only one data center.
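As a rough sketch of what I mean (not a tested setup): each keyspace uses NetworkTopologyStrategy and lists replicas for a single data center only. The snippet below just builds the CQL statements as strings. The data center names (DC1, DC2, DC3), the `timeseries_` keyspace prefix, the `keyspace_ddl` helper, and the replica count of 3 are all made up for illustration. The statement form is the CQL `CREATE KEYSPACE` syntax from later Cassandra releases; older releases configure the same thing through cassandra-cli with a different syntax.

```python
def keyspace_ddl(keyspace: str, datacenter: str, replicas: int = 3) -> str:
    """Build a CQL CREATE KEYSPACE statement whose replicas live in one DC only.

    Because no other data center is listed in the replication map,
    NetworkTopologyStrategy places no replicas of this keyspace elsewhere.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', "
        f"'{datacenter}': {replicas}}};"
    )


# Hypothetical data center names; they must match the names the cluster's
# snitch reports for each data center.
DATACENTERS = ["DC1", "DC2", "DC3"]

# One keyspace per data center, e.g. timeseries_dc1 is stored only in DC1.
statements = {dc: keyspace_ddl(f"timeseries_{dc.lower()}", dc) for dc in DATACENTERS}

for stmt in statements.values():
    print(stmt)
```

Clients in the master data center would then read a remote site's data by querying that site's keyspace, since all nodes are still part of one ring.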

-Jeremiah

On Nov 22, 2011, at 8:49 PM, Jeremiah Jordan wrote:

> Cassandra's multiple data center support is meant for replicating all data
> across multiple data centers efficiently.
>
> You could use the ByteOrderedPartitioner to prefix data with a key and assign
> those keys to nodes in specific data centers, though the edge nodes would get
> tricky, as those would want to have replicas in other data centers. You could
> probably do some stuff with sentinel values, and some nodes that only
> replicate data and aren't the primary node for any data, to make this not
> happen.
>
> It is doable, though this would probably be more trouble than it is worth. I
> would probably just make each DC its own cluster and have client logic which
> knows which DC to query.
>
> -Jeremiah
>
> On Nov 22, 2011, at 6:57 PM, Mathieu Lalonde wrote:
>
>> Hi,
>>
>> I am wondering if Cassandra's features and datacenter awareness can help me
>> with my scalability problems.
>>
>> Suppose that I have 10-20 data centers, each with its own local
>> (massive) source of time series data. I would like:
>> - to avoid replication across data centers (this seems doable based on:
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Different-KeySpaces-for-different-nodes-in-the-same-ring-td5096393.html#a5096568
>> )
>> - writes for local data to be done on the local data center (not sure about
>> that one)
>> - reads from a master data center to any remote data centers (not sure about
>> that one either)
>>
>> It sounds like I am trying to use Cassandra in a very different way than it
>> was intended to be used.
>> Should I simply have a middle tier that takes care of distributing reads to
>> multiple data centers and treat each data center as its own autonomous
>> cluster?
>>
>> Thanks!
>> Matt