You'd really want https://issues.apache.org/jira/browse/CASSANDRA-2369
to control replication per row. Let me know if you'd like to help tackle that.

On Tue, Apr 5, 2011 at 5:05 PM, Yudong Gao <st...@umich.edu> wrote:
>
> Hi,
>
> I am thinking about using Cassandra for our research project, and we
> are considering one interesting feature.
>
> Our setup has multiple datacenters in different geographic
> locations. Data is accessed with predictable patterns. Think of
> something like Craigslist: data objects corresponding to CA will
> mostly be accessed by users on the West Coast. In this case, if all
> the replicas are stored on the East Coast, access would not be
> efficient. Other applications, such as Facebook, should have a
> similar concern.
>
> I am aware of placement strategies such as
> RackAwareStrategy/NetworkTopologyStrategy, but they place objects
> based on their hashed token, not their access pattern. I am
> thinking about one possible trick, which is to manipulate the key of
> the object based on its access pattern, so that the key maps
> to a token that will have at least one replica (ideally the primary
> replica) stored in the desired data center, with the other replicas
> stored in other data centers for reliability.
>
> I found this post discussing a similar problem,
>
> http://www.mail-archive.com/user@cassandra.apache.org/msg00695.html
>
> but Ben suggested just writing a new replication strategy. IMO,
> location-aware replication should be a common problem for Cassandra,
> especially since it is widely used in many large-scale
> commercial applications such as Facebook and Twitter. I am interested
> in how they handle this problem.
>
> Is there any existing solution that I can refer to and get started with?
>
> Thanks!
>
> Yudong
>
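
For illustration only, the key-manipulation trick described in the quoted message could look roughly like the sketch below. It is not how Cassandra places replicas and not an existing API: the four-node ring, the data center names, the `salt:key` format, and the helper names `token_for`, `primary_dc`, and `salted_key` are all hypothetical, and the MD5-based token is only a simplified stand-in for a hash partitioner. The idea is to try small salt prefixes until the hashed key lands in a token range whose primary owner sits in the desired data center.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # assumed token space; simplified, not Cassandra's exact scheme

# Hypothetical ring: (node_token, datacenter), sorted by token.
RING = [(i * RING_SIZE // 4, dc) for i, dc in enumerate(["east", "west", "east", "west"])]


def token_for(key: str) -> int:
    """Hash a key to a token (simplified stand-in for a hash partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def primary_dc(token: int, ring) -> str:
    """Return the data center of the node owning this token.

    A node owns the range (previous node's token, its own token], so the
    owner is the first node whose token is >= the key token, wrapping
    around to the first node at the end of the ring.
    """
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token)
    return ring[i % len(ring)][1]


def salted_key(key: str, wanted_dc: str, ring, max_tries: int = 1000) -> str:
    """Prefix the key with the smallest salt that puts its primary replica in wanted_dc."""
    for salt in range(max_tries):
        candidate = f"{salt}:{key}"
        if primary_dc(token_for(candidate), ring) == wanted_dc:
            return candidate
    raise ValueError(f"no salt below {max_tries} maps {key!r} to {wanted_dc!r}")


if __name__ == "__main__":
    k = salted_key("listing:sf-bay-42", "west", RING)
    print(k, "->", primary_dc(token_for(k), RING))
```

One caveat with this approach: the salt depends on the ring layout, so any token change or node addition can move an already-written key to a different data center, and readers must be able to recompute or look up the same salt. That fragility is part of why per-row control inside the replication layer is the more robust direction.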



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
