Hello Anthony,

Indeed, I did not start the seed node of every rack first. Thank you for
the post. I believe this is something important to have as official
documentation on cassandra.apache.org. This issue, like many others, is
not documented properly.

Of course I find The Last Pickle blog very useful in these matters, but
having proper documentation on how to start a fresh new Cassandra cluster
is essential.

I have one question about your post. When you mention
"*However, therein lies the problem, for existing clusters updating this
setting is easy, as a keyspace already exists*",
what is the benefit of using allocate_tokens_for_keyspace in a cluster that
already has data, given that its tokens are already distributed? In the worst
case, the cluster is already unbalanced.
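
Just to be sure we are talking about the same setting, this is roughly what
I mean in cassandra.yaml (the keyspace name is only a placeholder; as far as
I understand, the setting is only read when a node allocates its tokens at
its very first start):

```
# cassandra.yaml - only read when the node picks its tokens,
# i.e. at its very first start / bootstrap
num_tokens: 8
allocate_tokens_for_keyspace: my_keyspace   # placeholder keyspace name
```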


Cheers

Jean Carlo

"The best way to predict the future is to invent it" Alan Kay


On Mon, Apr 29, 2019 at 2:45 AM Anthony Grasso <anthony.gra...@gmail.com>
wrote:

> Hi Jean,
>
> It sounds like there are no nodes in one of the racks for the eu-west-3
> datacenter. What does the output of nodetool status look like currently?
>
> Note, you will need to start a node in each rack before creating the
> keyspace. I wrote a blog post with the procedure to set up a new cluster
> using the predictive token allocation algorithm:
> http://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
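>
> The short version (a rough sketch; the keyspace name and token count below
> are only placeholders, the post has the exact steps): bring up one node in
> each rack first, create the keyspace with the production replication
> settings, and only then bootstrap the remaining nodes with something like
> this in their cassandra.yaml before their first start:
>
> ```
> # cassandra.yaml on the nodes bootstrapped after the keyspace exists
> num_tokens: 8
> allocate_tokens_for_keyspace: my_keyspace
> ```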
>
> Regards,
> Anthony
>
> On Fri, 26 Apr 2019 at 19:53, Jean Carlo <jean.jeancar...@gmail.com>
> wrote:
>
>> Creating a fresh new cluster in AWS using this procedure, I hit this
>> problem while bootstrapping the second rack of a 6-machine cluster with
>> 3 racks and a keyspace with RF 3:
>>
>> WARN  [main] 2019-04-26 11:37:43,845 TokenAllocation.java:63 - Selected
>> tokens [-5106267594614944625, 623001446449719390, 7048665031315327212,
>> 3265006217757525070, 5054577454645148534, 314677103601736696,
>> 7660890915606146375, -5329427405842523680]
>> ERROR [main] 2019-04-26 11:37:43,860 CassandraDaemon.java:749 - Fatal
>> configuration error
>> org.apache.cassandra.exceptions.ConfigurationException: Token allocation
>> failed: the number of racks 2 in datacenter eu-west-3 is lower than its
>> replication factor 3.
>>
>> Has anyone else run into this problem?
>>
>> I am not quite sure why I get this error, since my cluster has 3 racks.
>>
>> Cluster Information:
>>     Name: test
>>     Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
>>     DynamicEndPointSnitch: enabled
>>     Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>     Schema versions:
>>         3bf63440-fad7-3371-9c14-4855ad11ee83: [192.0.0.1, 192.0.0.2]
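>>
>> In case it helps, with GossipingPropertyFileSnitch each node gets its rack
>> from cassandra-rackdc.properties; on my side it looks roughly like this on
>> every node (example values, with three distinct rack values across the six
>> nodes):
>>
>> ```
>> # cassandra-rackdc.properties (example values for one node)
>> dc=eu-west-3
>> rack=rack1
>> ```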
>>
>>
>>
>> Jean Carlo
>>
>> "The best way to predict the future is to invent it" Alan Kay
>>
>>
>> On Thu, Jan 24, 2019 at 10:32 AM Ahmed Eljami <ahmed.elj...@gmail.com>
>> wrote:
>>
>>> Hi folks,
>>>
>>> What about adding a new keyspace, test_2, with the same RF to the existing
>>> cluster?
>>>
>>> Will it use the same logic as the existing keyspace test? Or should I
>>> restart the nodes after adding the new keyspace to cassandra.yaml?
>>>
>>> Thanks.
>>>
>>> On Tue, 2 Oct 2018 at 10:28, Varun Barala <varunbaral...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Managing `initial_token` by yourself will give you more control over
>>>> scale-in and scale-out.
>>>> Let's say you have a three-node cluster with `num_tokens: 1`
>>>>
>>>> And your initial range looks like:-
>>>>
>>>> Datacenter: datacenter1
>>>> ==========
>>>> Address    Rack   Status  State   Load       Owns    Token
>>>>                                                      3074457345618258602
>>>> 127.0.0.1  rack1  Up      Normal  98.96 KiB  66.67%  -9223372036854775808
>>>> 127.0.0.2  rack1  Up      Normal  98.96 KiB  66.67%  -3074457345618258603
>>>> 127.0.0.3  rack1  Up      Normal  98.96 KiB  66.67%  3074457345618258602
>>>>
>>>> Now let's say you want to scale out the cluster to twice the current
>>>> throughput (meaning you are adding 3 more nodes).
>>>>
>>>> If you are using AWS EBS volumes, you can reuse the same volumes and
>>>> spin up three more nodes at the midpoints of the existing ranges, which
>>>> means the new nodes already have the data.
>>>> Once you have mounted the volumes on your new nodes:
>>>> * You need to delete every system table except the schema-related tables.
>>>> * You need to generate the system.local table yourself, with
>>>> `Bootstrap state` set to completed and the schema version the same as on
>>>> the other existing nodes.
>>>> * You need to remove the extra data on all machines using cleanup
>>>> commands.
>>>>
>>>> This is how you can scale out a Cassandra cluster in minutes. If you want
>>>> to add nodes one by one, you need to write a small tool that always finds
>>>> the biggest range in the existing cluster and splits it in half.
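>>>>
>>>> As a rough illustration of the range-splitting idea (an untested sketch
>>>> using the example tokens above; Murmur3 tokens are signed 64-bit, so the
>>>> last range wraps around the ring):
>>>>
>>>> ```
>>>> RING = 2**64
>>>> tokens = sorted([-9223372036854775808, -3074457345618258603,
>>>>                  3074457345618258602])
>>>> midpoints = []
>>>> for i, t in enumerate(tokens):
>>>>     # next token clockwise; unwrap the last range past the ring boundary
>>>>     nxt = tokens[(i + 1) % len(tokens)] + (RING if i == len(tokens) - 1 else 0)
>>>>     mid = (t + nxt) // 2
>>>>     if mid >= 2**63:          # fold back into the signed 64-bit range
>>>>         mid -= RING
>>>>     midpoints.append(mid)
>>>> print(midpoints)  # initial_token candidates for the three new nodes
>>>> ```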
>>>>
>>>> However, I have never tested this thoroughly; it should work
>>>> conceptually. Here we are taking advantage of the fact that we already
>>>> have the volumes (data) for the new nodes beforehand, so we do not need
>>>> to bootstrap them.
>>>>
>>>> Thanks & Regards,
>>>> Varun Barala
>>>>
>>>> On Tue, Oct 2, 2018 at 2:31 PM onmstester onmstester <
>>>> onmstes...@zoho.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 01 Oct 2018 18:36:03 +0330, Alain RODRIGUEZ
>>>>> <arodr...@gmail.com> wrote:
>>>>>
>>>>> Hello again :),
>>>>>
>>>>> I thought a little bit more about this question, and I was actually
>>>>> wondering if something like this would work:
>>>>>
>>>>> Imagine a 3-node cluster created with:
>>>>> All 3 nodes: `num_tokens: 4`
>>>>> Node 1: `initial_token: -9223372036854775808, -4611686018427387905, -2,
>>>>> 4611686018427387901`
>>>>> Node 2: `initial_token: -7686143364045646507, -3074457345618258604,
>>>>> 1537228672809129299, 6148914691236517202`
>>>>> Node 3: `initial_token: -6148914691236517206, -1537228672809129303,
>>>>> 3074457345618258600, 7686143364045646503`
>>>>>
>>>>>  If you know the initial size of your cluster, you can calculate the
>>>>> total number of tokens: number of nodes * vnodes and use the
>>>>> formula/python code above to get the tokens. Then use the first token for
>>>>> the first node, move to the second node, use the second token and repeat.
>>>>> In my case there is a total of 12 tokens (3 nodes, 4 tokens each):
>>>>> ```
>>>>> >>> number_of_tokens = 12
>>>>> >>> [str(((2**64 // number_of_tokens) * i) - 2**63)
>>>>> ...  for i in range(number_of_tokens)]
>>>>> ['-9223372036854775808', '-7686143364045646507', '-6148914691236517206',
>>>>>  '-4611686018427387905', '-3074457345618258604', '-1537228672809129303',
>>>>>  '-2', '1537228672809129299', '3074457345618258600', '4611686018427387901',
>>>>>  '6148914691236517202', '7686143364045646503']
>>>>> ```
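>>>>>
>>>>> To go from that list to the per-node initial_token lines above, a small
>>>>> round-robin sketch (same numbers; node 1 takes tokens 0, 3, 6 and 9,
>>>>> node 2 takes 1, 4, 7 and 10, and so on):
>>>>>
>>>>> ```
>>>>> >>> nodes, vnodes = 3, 4
>>>>> >>> tokens = [((2**64 // (nodes * vnodes)) * i) - 2**63
>>>>> ...           for i in range(nodes * vnodes)]
>>>>> >>> for n in range(nodes):
>>>>> ...     print(', '.join(str(t) for t in tokens[n::nodes]))
>>>>> ```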
>>>>>
>>>>>
>>>>> Using manual initial_token (your idea), how could I add a new node to
>>>>> a long-running cluster? What would the procedure be?
>>>>>
>>>>>
>>>
>>> --
>>> Regards,
>>>
>>> Ahmed ELJAMI
>>>
>>
