Solandra does this https://github.com/tjake/Solandra/blob/solandra/src/lucandra/dht/RandomPartitioner.java
But Row Groups is going to be the "official" way.

-Jake

On Wed, Nov 9, 2011 at 5:53 PM, Todd Burruss <bburr...@expedia.com> wrote:

> Thx jake for the JIRA, but there was someone at the conference that had
> already implemented what I mentioned. It didn't offer any atomicity, just
> co-locating a family of data on the same node.
>
> From: Jake Luciani <jak...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wed, 9 Nov 2011 02:53:20 -0800
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: Second Cassandra users survey
>
> Hi Todd,
>
> Entity Groups : https://issues.apache.org/jira/browse/CASSANDRA-1684
>
> -Jake
>
> On Wed, Nov 9, 2011 at 6:44 AM, Todd Burruss <bburr...@expedia.com> wrote:
>
>> I believe I heard someone talk at Cassandra SF conference about creating a
>> partitioner that was a derivation of RandomPartitioner. It essentially
>> would look for keys that adhere to a certain pattern, like <key>:<subkey>.
>> The <key> portion would be used for determining the location on the ring,
>> but <key>:<subkey> for actually storing. This would allow groups of data
>> (all having the same <key>) to reside on the same node, while still
>> maintaining uniqueness across the entire keyspace.
>>
>> Unbalanced nodes could still occur, but I don't think any worse than
>> wide/large rows can cause.
>>
>>
>> On 11/8/11 1:29 AM, "Daniel Doubleday" <daniel.double...@gmx.net> wrote:
>>
>> >Ah cool - thanks for the pointer!
>> >
>> >On Nov 7, 2011, at 5:25 PM, Ed Anuff wrote:
>> >
>> >> This is basically what entity groups are about -
>> >> https://issues.apache.org/jira/browse/CASSANDRA-1684
>> >>
>> >> On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin <wool...@gmail.com> wrote:
>> >>> This feature interests me, so I thought I'd add some comments.
>> >>>
>> >>> Having used partition features in existing databases like DB2, Oracle
>> >>> and manual partitioning, one of the biggest challenges is keeping the
>> >>> partitions balanced. What I've seen with manual partitioning is that
>> >>> often the partitions get unbalanced. Usually the developers take a
>> >>> best guess and hope it ends up balanced.
>> >>>
>> >>> Some of the approaches I've used in the past were zip code, area code,
>> >>> state and some kind of hash.
>> >>>
>> >>> So my question related to deterministic sharding is this, "what rebalance
>> >>> feature(s) would be useful or needed once the partitions get
>> >>> unbalanced?"
>> >>>
>> >>> Without a decent plan for rebalancing, it often ends up being a very
>> >>> painful problem to solve in production. Back when I worked on mobile
>> >>> apps, we saw issues with how OpenWave WAP servers partitioned the
>> >>> accounts. The early versions randomly assigned a phone to a server
>> >>> when it was provisioned the first time. Once the phone was associated
>> >>> to that server, it was stuck on that server. If the load on that
>> >>> server was heavier than the others, the only choice was to "scale up"
>> >>> the hardware.
>> >>>
>> >>> My understanding of Cassandra's current sharding is consistent and
>> >>> random. Does the new feature sit somewhere in between? Are you
>> >>> thinking of a pluggable API so that you can provide your own hash
>> >>> algorithm for cassandra to use?
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Nov 7, 2011 at 7:54 AM, Daniel Doubleday
>> >>> <daniel.double...@gmx.net> wrote:
>> >>>> Allow for deterministic / manual sharding of rows.
>> >>>>
>> >>>> Right now it seems that there is no way to force rows with different
>> >>>> row keys to be stored on the same nodes in the ring.
>> >>>> This is our number one reason why we get data inconsistencies when
>> >>>> nodes fail.
>> >>>>
>> >>>> Sometimes a logical transaction requires writing rows with different
>> >>>> row keys.
>> >>>> If we could use something like this:
>> >>>>
>> >>>> prefix.uniquekey and let the partitioner use only the prefix, the
>> >>>> probability that only part of the transaction would be written could
>> >>>> be reduced considerably.
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Nov 1, 2011, at 11:59 PM, Jonathan Ellis wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> Two years ago I asked for Cassandra use cases and feature requests.
>> >>>>> [1] The results [2] have been extremely useful in setting and
>> >>>>> prioritizing goals for Cassandra development. But with the release of
>> >>>>> 1.0 we've accomplished basically everything from our original wish
>> >>>>> list. [3]
>> >>>>>
>> >>>>> I'd love to hear from modern Cassandra users again, especially if
>> >>>>> you're usually a quiet lurker. What does Cassandra do well? What are
>> >>>>> your pain points? What's your feature wish list?
>> >>>>>
>> >>>>> As before, if you're in stealth mode or don't want to say anything in
>> >>>>> public, feel free to reply to me privately and I will keep it off the
>> >>>>> record.
>> >>>>>
>> >>>>> [1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
>> >>>>> [2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
>> >>>>> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>> >>>>>
>> >>>>> --
>> >>>>> Jonathan Ellis
>> >>>>> Project Chair, Apache Cassandra
>> >>>>> co-founder of DataStax, the source for professional Cassandra support
>> >>>>> http://www.datastax.com
>> >>>>
>> >>>>
>> >>>
>> >
>>
>>
>
>
> --
> http://twitter.com/tjake
>

--
http://twitter.com/tjake
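
For readers following the thread, the <key>:<subkey> idea (hash only the <key> portion to pick a ring position, but store under the full key) might be sketched roughly as below. This is an illustration only, not Solandra's or Cassandra's code: the names token_for and node_for, the ":" separator, and the four-node ring are all made up for the example, and the raw 128-bit MD5 value is used as the token rather than reproducing Cassandra's exact token arithmetic.

```python
import hashlib
from bisect import bisect_left


def token_for(row_key: str, separator: str = ":") -> int:
    """Hash only the group prefix of the key (hypothetical helper).

    Everything before the first separator decides ring placement; a key
    with no separator is hashed whole, like a plain hash partitioner.
    """
    group = row_key.split(separator, 1)[0]
    digest = hashlib.md5(group.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")  # value in 0 .. 2**128 - 1


def node_for(row_key: str, ring: list) -> str:
    """Pick the first node whose token is >= the key's token, wrapping around.

    `ring` is assumed to be a list of (token, node_name) sorted by token.
    """
    tokens = [token for token, _ in ring]
    index = bisect_left(tokens, token_for(row_key)) % len(ring)
    return ring[index][1]


if __name__ == "__main__":
    # Assumed layout: four nodes with evenly spaced tokens.
    ring = [(i * 2**128 // 4, f"node{i}") for i in range(4)]
    for key in ("user42:profile", "user42:settings", "user7:profile"):
        # The full key is still what gets stored, so rows stay unique.
        print(key, "->", node_for(key, ring))
```

Both "user42:..." rows land on the same node because they share a token, which is the co-location property described above. As noted in the thread, this gives no atomicity, and a very popular prefix can still unbalance a node.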