Ah cool - thanks for the pointer!

On Nov 7, 2011, at 5:25 PM, Ed Anuff wrote:
> This is basically what entity groups are about -
> https://issues.apache.org/jira/browse/CASSANDRA-1684
>
> On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin <wool...@gmail.com> wrote:
>> This feature interests me, so I thought I'd add some comments.
>>
>> Having used partition features in existing databases like DB2, Oracle
>> and manual partitioning, one of the biggest challenges is keeping the
>> partitions balanced. What I've seen with manual partitioning is that
>> often the partitions get unbalanced. Usually the developers take a
>> best guess and hope it ends up balanced.
>>
>> Some of the approaches I've used in the past were zip code, area code,
>> state and some kind of hash.
>>
>> So my question related to deterministic sharding is this: "what
>> rebalance feature(s) would be useful or needed once the partitions get
>> unbalanced?"
>>
>> Without a decent plan for rebalancing, it often ends up being a very
>> painful problem to solve in production. Back when I worked on mobile
>> apps, we saw issues with how OpenWave WAP servers partitioned the
>> accounts. The early versions randomly assigned a phone to a server
>> when it was provisioned the first time. Once the phone was associated
>> with that server, it was stuck on that server. If the load on that
>> server was heavier than the others, the only choice was to "scale up"
>> the hardware.
>>
>> My understanding of Cassandra's current sharding is consistent and
>> random. Does the new feature sit somewhere in between? Are you
>> thinking of a pluggable API so that you can provide your own hash
>> algorithm for Cassandra to use?
>>
>> On Mon, Nov 7, 2011 at 7:54 AM, Daniel Doubleday
>> <daniel.double...@gmx.net> wrote:
>>> Allow for deterministic / manual sharding of rows.
>>>
>>> Right now it seems that there is no way to force rows with different row
>>> keys to be stored on the same nodes in the ring.
>>> This is our number one reason why we get data inconsistencies when nodes
>>> fail.
>>>
>>> Sometimes a logical transaction requires writing rows with different row
>>> keys. If we could use something like this:
>>>
>>> prefix.uniquekey and let the partitioner use only the prefix, the
>>> probability that only part of the transaction would be written could be
>>> reduced considerably.
>>>
>>> On Nov 1, 2011, at 11:59 PM, Jonathan Ellis wrote:
>>>
>>>> Hi all,
>>>>
>>>> Two years ago I asked for Cassandra use cases and feature requests.
>>>> [1] The results [2] have been extremely useful in setting and
>>>> prioritizing goals for Cassandra development. But with the release of
>>>> 1.0 we've accomplished basically everything from our original wish
>>>> list. [3]
>>>>
>>>> I'd love to hear from modern Cassandra users again, especially if
>>>> you're usually a quiet lurker. What does Cassandra do well? What are
>>>> your pain points? What's your feature wish list?
>>>>
>>>> As before, if you're in stealth mode or don't want to say anything in
>>>> public, feel free to reply to me privately and I will keep it off the
>>>> record.
>>>>
>>>> [1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
>>>> [2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
>>>> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>>>>
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of DataStax, the source for professional Cassandra support
>>>> http://www.datastax.com
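
For what it's worth, the prefix.uniquekey idea quoted above could be sketched roughly like this. It is only an illustration, not Cassandra's partitioner API: the function names, the "." separator convention, the four-node ring and the choice of MD5 are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_left

# Hypothetical ring: four nodes with evenly spaced tokens over a 128-bit space.
RING = [(i * 2**128 // 4, f"node{i}") for i in range(4)]


def token_for(row_key: str, use_prefix: bool = True) -> int:
    """Derive a 128-bit token from the key.

    With use_prefix=True only the part before the first '.' is hashed
    (an assumed key convention), so 'prefix.a' and 'prefix.b' share a token.
    """
    shard_key = row_key.split(".", 1)[0] if use_prefix else row_key
    digest = hashlib.md5(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def node_for(row_key: str, ring=RING, use_prefix: bool = True) -> str:
    """Pick the first node whose token is >= the key's token, wrapping around."""
    token = token_for(row_key, use_prefix)
    tokens = [t for t, _ in ring]
    idx = bisect_left(tokens, token) % len(ring)
    return ring[idx][1]


if __name__ == "__main__":
    # Rows of one logical transaction share the prefix, so they share a node.
    for key in ("order42.header", "order42.lines", "order42.payment"):
        print(key, "->", node_for(key))
```

Hashing only the prefix keeps related rows together, but it also means a hot prefix cannot be spread across nodes, which is the rebalancing concern Peter raises.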