Re: Second Cassandra users survey

Jake Luciani Wed, 09 Nov 2011 10:07:14 -0800

Solandra does this
https://github.com/tjake/Solandra/blob/solandra/src/lucandra/dht/RandomPartitioner.java


But Row Groups is going to be the "official" way.
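
For anyone following along, here is a rough sketch of the idea Todd describes in the quoted thread below: hash only the <key> part of <key>:<subkey> to pick the ring position, but keep the full string as the row key. This is not Solandra's code and not a Cassandra API. The names routing_key, token, owner and RING are made up for illustration, the ":" separator is an assumption, and the token is a plain unsigned MD5 integer, which only approximates what RandomPartitioner does.

```python
import hashlib
from bisect import bisect_left

# Hypothetical 4-node ring: (token, node) pairs sorted by token,
# evenly spaced over the 128-bit MD5 space.
RING = [(i * 2**128 // 4, f"node{i}") for i in range(4)]


def routing_key(row_key: str, sep: str = ":") -> str:
    """Return the part of the row key used for placement.

    Everything before the first separator; a key with no separator
    routes on the whole key.
    """
    return row_key.split(sep, 1)[0]


def token(row_key: str) -> int:
    """Hash only the routing part, so all subkeys share one token."""
    digest = hashlib.md5(routing_key(row_key).encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def owner(row_key: str, ring=RING) -> str:
    """Pick the node with the first token >= the key's token, wrapping around."""
    tokens = [t for t, _ in ring]
    index = bisect_left(tokens, token(row_key)) % len(ring)
    return ring[index][1]


# Rows that share a prefix land on the same node but remain distinct rows.
same_node = owner("user42:profile") == owner("user42:settings")
```

As Todd notes, this only co-locates rows. It gives no atomicity, and one very popular prefix can still unbalance a node.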

-Jake

On Wed, Nov 9, 2011 at 5:53 PM, Todd Burruss <bburr...@expedia.com> wrote:

> Thx jake for the JIRA, but there was someone at the conference that had
> already implemented what I mentioned.  It didn't offer any atomicity, just
> co-locating a family of data on the same node.
>
> From: Jake Luciani <jak...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wed, 9 Nov 2011 02:53:20 -0800
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>
> Subject: Re: Second Cassandra users survey
>
> Hi Todd,
>
> Entity Groups : https://issues.apache.org/jira/browse/CASSANDRA-1684
>
> -Jake
>
> On Wed, Nov 9, 2011 at 6:44 AM, Todd Burruss <bburr...@expedia.com> wrote:
>
>> I believe I heard someone talk at Cassandra SF conference about creating a
>> partitioner that was a derivation of RandomPartitioner.  It essentially
>> would look for keys that adhere to a certain pattern, like <key>:<subkey>.
>>  The <key> portion would be used for determining the location on the ring,
>> but <key>:<subkey> for actually storing.  This would allow groups of data
>> (all having the same <key>) to reside on the same node, while still
>> maintaining uniqueness across the entire keyspace.
>>
>> Unbalanced nodes could still occur, but I don't think any worse than
>> wide/large rows can cause.
>>
>>
>> On 11/8/11 1:29 AM, "Daniel Doubleday" <daniel.double...@gmx.net> wrote:
>>
>> >Ah cool - thanks for the pointer!
>> >
>> >On Nov 7, 2011, at 5:25 PM, Ed Anuff wrote:
>> >
>> >> This is basically what entity groups are about -
>> >> https://issues.apache.org/jira/browse/CASSANDRA-1684
>> >>
>> >> On Mon, Nov 7, 2011 at 5:26 AM, Peter Lin <wool...@gmail.com> wrote:
>> >>> This feature interests me, so I thought I'd add some comments.
>> >>>
>> >>> Having used partition features in existing databases like DB2, Oracle
>> >>> and manual partitioning, one of the biggest challenges is keeping the
>> >>> partitions balanced. What I've seen with manual partitioning is that
>> >>> often the partitions get unbalanced. Usually the developers take a
>> >>> best guess and hope it ends up balanced.
>> >>>
>> >>> Some of the approaches I've used in the past were zip code, area code,
>> >>> state and some kind of hash.
>> >>>
>> >>> So my question related to deterministic sharding is this, "what rebalance
>> >>> feature(s) would be useful or needed once the partitions get
>> >>> unbalanced?"
>> >>>
>> >>> Without a decent plan for rebalancing, it often ends up being a very
>> >>> painful problem to solve in production. Back when I worked mobile
>> >>> apps, we saw issues with how OpenWave WAP servers partitioned the
>> >>> accounts. The early versions randomly assigned a phone to a server
>> >>> when it was provisioned the first time. Once the phone was associated
>> >>> to that server, it was stuck on that server. If the load on that
>> >>> server was heavier than the others, the only choice was to "scale up"
>> >>> the hardware.
>> >>>
>> >>> My understanding is that Cassandra's current sharding is consistent and
>> >>> random. Does the new feature sit somewhere in between? Are you
>> >>> thinking of a pluggable API so that you can provide your own hash
>> >>> algorithm for cassandra to use?
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Nov 7, 2011 at 7:54 AM, Daniel Doubleday
>> >>> <daniel.double...@gmx.net> wrote:
>> >>>> Allow for deterministic / manual sharding of rows.
>> >>>>
>> >>>> Right now it seems that there is no way to force rows with different
>> >>>> row keys to be stored on the same nodes in the ring.
>> >>>> This is our number one reason why we get data inconsistencies when
>> >>>> nodes fail.
>> >>>>
>> >>>> Sometimes a logical transaction requires writing rows with different
>> >>>> row keys. If we could use something like this:
>> >>>>
>> >>>> prefix.uniquekey and let the partitioner use only the prefix, the
>> >>>> probability that only part of the transaction would be written could
>> >>>> be reduced considerably.
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Nov 1, 2011, at 11:59 PM, Jonathan Ellis wrote:
>> >>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> Two years ago I asked for Cassandra use cases and feature requests.
>> >>>>> [1]  The results [2] have been extremely useful in setting and
>> >>>>> prioritizing goals for Cassandra development.  But with the release of
>> >>>>> 1.0 we've accomplished basically everything from our original wish
>> >>>>> list. [3]
>> >>>>>
>> >>>>> I'd love to hear from modern Cassandra users again, especially if
>> >>>>> you're usually a quiet lurker.  What does Cassandra do well?  What are
>> >>>>> your pain points?  What's your feature wish list?
>> >>>>>
>> >>>>> As before, if you're in stealth mode or don't want to say anything in
>> >>>>> public, feel free to reply to me privately and I will keep it off the
>> >>>>> record.
>> >>>>>
>> >>>>> [1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
>> >>>>> [2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
>> >>>>> [3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
>> >>>>>
>> >>>>> --
>> >>>>> Jonathan Ellis
>> >>>>> Project Chair, Apache Cassandra
>> >>>>> co-founder of DataStax, the source for professional Cassandra support
>> >>>>> http://www.datastax.com
>> >>>>
>> >>>>
>> >>>
>> >
>>
>>
>
>
> --
> http://twitter.com/tjake
>



-- 
http://twitter.com/tjake
