So basically you want to create a cluster with multiple unique keys, but have the data that belongs to one unique key colocated. Correct?
-Vivek

On Tue, Dec 3, 2013 at 10:39 AM, onlinespending <onlinespend...@gmail.com> wrote:

> Subject says it all. I want to be able to randomly distribute a large set
> of records but keep them clustered in one wide row per node.
>
> As an example, lets say I’ve got a collection of about 1 million records
> each with a unique id. If I just go ahead and set the primary key (and
> therefore the partition key) as the unique id, I’ll get very good random
> distribution across my server cluster. However, each record will be its own
> row. I’d like to have each record belong to one large wide row (per server
> node) so I can have them sorted or clustered on some other column.
>
> If I say have 5 nodes in my cluster, I could randomly assign a value of 1 - 5
> at the time of creation and have the partition key set to this value.
> But this becomes troublesome if I add or remove nodes. What effectively I
> want is to partition on the unique id of the record modulus N (id % N;
> where N is the number of nodes).
>
> I have to imagine there’s a mechanism in Cassandra to simply randomize the
> partitioning without even using a key (and then clustering on some column).
>
> Thanks for any help.
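
For reference, here is a rough sketch of the "id % N" idea from the quoted message, done on the client side. It is only an illustration under some assumptions: N is a fixed bucket count (NUM_BUCKETS, a made-up constant) rather than the number of nodes, so adding or removing nodes does not change which bucket a record lands in. The helper name bucket_for and the table and column names in the comment are hypothetical, not something from the original thread. Note also that Cassandra still hashes the bucket value to place the partition, so this does not guarantee exactly one wide row per node.

```python
import hashlib

# Assumption: a fixed number of buckets, chosen independently of cluster size.
# Picking more buckets than nodes tends to spread the wide rows more evenly.
NUM_BUCKETS = 64


def bucket_for(record_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a record's unique id to a stable bucket in [0, num_buckets).

    A stable hash (MD5 here) is used instead of Python's built-in hash(),
    which is salted per process and would give different buckets per run.
    """
    digest = hashlib.md5(record_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


# Hypothetical CQL schema this would pair with (names are illustrative only):
#
#   CREATE TABLE records (
#       bucket   int,        -- partition key, computed by bucket_for()
#       sort_col timestamp,  -- clustering column: rows sort on this per bucket
#       id       text,       -- the record's unique id, kept for uniqueness
#       payload  text,
#       PRIMARY KEY ((bucket), sort_col, id)
#   );

if __name__ == "__main__":
    for rid in ("rec-1", "rec-2", "rec-3"):
        print(rid, "-> bucket", bucket_for(rid))
```

With a layout like this, each bucket is one wide row sorted on the clustering column. Reading everything in sorted order would mean querying each bucket and merging the results on the client.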