You'd really want https://issues.apache.org/jira/browse/CASSANDRA-2369 to control replica placement per row. Let me know if you'd like to help tackle that.
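
Until something like that exists, the key-manipulation trick described in the quoted message can be sketched roughly as follows. This is only an illustration under stated assumptions, not how Cassandra or any existing strategy does it: the token is assumed to be an MD5-derived value in the style of RandomPartitioner, the four-node `RING` with its datacenter labels is made up, and `token_for`, `primary_dc`, and `key_in_dc` are hypothetical helper names.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space, in the style of RandomPartitioner


def token_for(key: str) -> int:
    """Assumed token function: absolute value of the signed MD5 digest."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def primary_dc(ring, token):
    """Return the datacenter of the node owning `token`.

    `ring` is a list of (node_token, datacenter) pairs sorted by token.
    A node owns the range (previous_token, node_token], wrapping around.
    """
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token)
    return ring[i % len(ring)][1]


def key_in_dc(base_key, dc, ring, max_tries=1000):
    """Append a numeric salt to `base_key` until its primary replica is in `dc`."""
    for salt in range(max_tries):
        candidate = f"{base_key}#{salt}"
        if primary_dc(ring, token_for(candidate)) == dc:
            return candidate
    raise ValueError(f"no salted key found for datacenter {dc!r}")


# Hypothetical ring: four evenly spaced nodes alternating between two datacenters.
RING = [(i * RING_SIZE // 4, "east" if i % 2 == 0 else "west") for i in range(4)]

key = key_in_dc("listing:ca:42", "west", RING)
print(key, primary_dc(RING, token_for(key)))
```

Note that this only steers the primary replica; the remaining replicas still go wherever the replication strategy puts them. Readers also have to know the salted key to find the row, and any change to the ring can move existing keys to a different datacenter.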

On Tue, Apr 5, 2011 at 5:05 PM, Yudong Gao <st...@umich.edu> wrote:
>
> Hi,
>
> I am thinking about using Cassandra for our research project, and we
> are thinking about one interesting feature.
>
> Our setup has multiple datacenters located in different geographic
> locations. Data is accessed with predictable patterns. Think of
> something like Craigslist: data objects corresponding to CA will
> mostly be accessed by users from the west coast. In this case, if all
> the replicas are stored on the east coast, access would not be
> efficient. Other applications, such as Facebook, should have a
> similar concern.
>
> I am aware of placement strategies such as
> RackAwareStrategy/NetworkTopologyStrategy. However, they place objects
> based on their hashed token, not their access pattern. I am
> thinking about one possible trick, which is to manipulate the key of
> the object based on its access pattern, so that the key maps
> to a token that will have at least one replica (ideally the primary
> replica) stored in the desired data center, with the other replicas
> stored in other data centers for reliability.
>
> I found this post discussing a similar problem,
>
> http://www.mail-archive.com/user@cassandra.apache.org/msg00695.html
>
> but Ben suggested just writing a new replication strategy. IMO, this
> location-aware replication should be a common problem for Cassandra,
> especially since it has been widely used in many large-scale
> commercial applications such as Facebook and Twitter. I am interested
> in how they handle this problem.
>
> Is there any existing solution that I can refer to and get started with?
>
> Thanks!
>
> Yudong
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com