comments below

On Dec 9, 2013, at 11:33 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

>> But this becomes troublesome if I add or remove nodes. What effectively I 
>> want is to partition on the unique id of the record modulus N (id % N; where 
>> N is the number of nodes).
> This is exactly the problem consistent hashing (used by Cassandra) is 
> designed to solve. If you hash the key and take it modulo the number of 
> nodes, adding or removing nodes requires a lot of data to move. 
> 
>> I want to be able to randomly distribute a large set of records but keep 
>> them clustered in one wide row per node.
> Sounds like you should revisit your data modelling; this is a pretty well 
> known anti-pattern. 
> 
> When rows get above a few tens of MB things can slow down, when they get 
> above 50 MB they can be a pain, and when they get above 100 MB it’s a 
> warning sign. When they get above 1 GB, well, you don’t want to know what 
> happens then. 
> 
> It’s a bad idea and you should take another look at the data model. If you 
> have to do it, you can try the ByteOrderedPartitioner, which uses the row 
> key as the token, giving you total control over row placement.


You should re-read my last paragraph as an example that would clearly benefit 
from such an approach. If you understand how paging works, you’ll see how 
you’d benefit from grouping probabilistically similar data within each node, 
while also splitting data across nodes to reduce hot spotting.

Regardless, I no longer think it’s necessary to have a single wide row per 
node. Several wide rows per node are just as good, since for all practical 
purposes paging in the first N key/values from each of M rows on a node is the 
same as reading in the first N*M key/values from a single row.

So I’m going to do what I alluded to before: treat the LSB of a record’s id as 
the partition key, and then cluster on something meaningful (such as a 
geohash) plus the prefix of the id.

create table test (
    id_prefix int,
    id_suffix int,
    geohash text,
    value text,
    primary key (id_suffix, geohash, id_prefix)
);
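
A minimal write sketch, assuming (purely for illustration) that id_suffix is 
the low byte of the record id (id % 256) and id_prefix is everything above it 
(id / 256); the id and geohash values below are made up:

-- e.g. user id 123456789 -> id_suffix = 123456789 % 256 = 21,
--                           id_prefix = 123456789 / 256 = 482253
insert into test (id_suffix, geohash, id_prefix, value)
values (21, 'gbsuv', 482253, 'some payload');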

So if this were for a collection of users, they would be randomly distributed 
across nodes to increase parallelism and reduce hotspots, but within each wide 
row they’d be meaningfully clustered by geographic region, increasing the 
probability that paged-in data contains more of the working set.
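
As a rough read-side sketch (the bucket value and geohash bounds are made up), 
a page from one partition comes back ordered by geohash, and a clustering 
range restriction keeps a page to a single region:

-- page through one randomly assigned bucket, ordered by geohash
select geohash, id_prefix, value from test where id_suffix = 21 limit 1000;

-- or pull only users whose geohash starts with 'gbsu' (one region)
select geohash, id_prefix, value from test
where id_suffix = 21 and geohash >= 'gbsu' and geohash < 'gbsv' limit 1000;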



> 
> Cheers
> 
> 
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
> 
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
> 
> On 4/12/2013, at 8:32 pm, Vivek Mishra <mishra.v...@gmail.com> wrote:
> 
>> So basically you want to create a cluster of multiple unique keys, but data 
>> which belongs to one unique key should be colocated. Correct?
>> 
>> -Vivek
>> 
>> 
>> On Tue, Dec 3, 2013 at 10:39 AM, onlinespending <onlinespend...@gmail.com> 
>> wrote:
>> Subject says it all. I want to be able to randomly distribute a large set of 
>> records but keep them clustered in one wide row per node.
>> 
>> As an example, let’s say I’ve got a collection of about 1 million records 
>> each with a unique id. If I just go ahead and set the primary key (and 
>> therefore the partition key) as the unique id, I’ll get very good random 
>> distribution across my server cluster. However, each record will be its own 
>> row. I’d like to have each record belong to one large wide row (per server 
>> node) so I can have them sorted or clustered on some other column.
>> 
>> If, say, I have 5 nodes in my cluster, I could randomly assign a value of 1 
>> to 5 at creation time and have the partition key set to this value. But 
>> this becomes troublesome if I add or remove nodes. What effectively I want 
>> is to partition on the unique id of the record modulus N (id % N; where N is 
>> the number of nodes).
>> 
>> I have to imagine there’s a mechanism in Cassandra to simply randomize the 
>> partitioning without even using a key (and then clustering on some column).
>> 
>> Thanks for any help.
>> 
> 
