So basically you want to create a cluster of multiple unique keys, but data
that belongs to one unique key should be colocated. Correct?

-Vivek


On Tue, Dec 3, 2013 at 10:39 AM, onlinespending <onlinespend...@gmail.com> wrote:

> Subject says it all. I want to be able to randomly distribute a large set
> of records but keep them clustered in one wide row per node.
>
> As an example, let’s say I’ve got a collection of about 1 million records
> each with a unique id. If I just go ahead and set the primary key (and
> therefore the partition key) as the unique id, I’ll get very good random
> distribution across my server cluster. However, each record will be its own
> row. I’d like to have each record belong to one large wide row (per server
> node) so I can have them sorted or clustered on some other column.
>
> If I have, say, 5 nodes in my cluster, I could randomly assign a value of
> 1 to 5 at the time of creation and have the partition key set to this value.
> But this becomes troublesome if I add or remove nodes. What I effectively
> want is to partition on the unique id of the record modulo N (id % N,
> where N is the number of nodes).
>
> I have to imagine there’s a mechanism in Cassandra to simply randomize the
> partitioning without even using a key (and then clustering on some column).
>
> Thanks for any help.
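
For illustration, the "id % N" idea from the quoted message could be sketched as below. This is only a minimal sketch under assumptions that are not in the original post: the ids are UUIDs, and N is a fixed bucket count (the hypothetical NUM_BUCKETS) chosen independently of the node count, so adding or removing nodes does not change which bucket a record maps to. The names bucket_for and spread are made up for the example.

```python
import uuid
from collections import Counter

# Hypothetical fixed bucket count; an assumption, deliberately not tied
# to the number of nodes so it stays stable when the cluster changes size.
NUM_BUCKETS = 16


def bucket_for(record_id: uuid.UUID, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a record's unique id to a bucket number (id % N)."""
    return record_id.int % num_buckets


def spread(num_records: int, num_buckets: int = NUM_BUCKETS) -> Counter:
    """Count how many randomly generated ids fall into each bucket."""
    return Counter(
        bucket_for(uuid.uuid4(), num_buckets) for _ in range(num_records)
    )


if __name__ == "__main__":
    # Print the number of records per bucket for 100,000 random ids.
    for bucket, count in sorted(spread(100_000).items()):
        print(bucket, count)
```

The bucket value would then serve as the partition key, with the column to sort on as a clustering column. Note that Cassandra still hashes the bucket value to decide placement, so with only a handful of buckets an even spread across nodes is not guaranteed.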
