I believe the two solutions being referred to are "lift and shift" vs. upgrading 
by replacing a node and letting it restore its data from the cluster.

I don't think there are any more "risks" per se with upgrading by replacing, as 
long as you can make sure your new node is configured properly.  One might 
choose lift-and-shift in order to have a node down for less time (depending on 
your individual situation), or to have less of an impact on the cluster, since 
replacing a node requires the other nodes to stream their data to the newly 
replaced node.  Depending on your dataset, this could take quite some time.
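
If you do go the replacement route, a rough sketch of the mechanics (the address 
below is a placeholder, and depending on your version the flag is 
-Dcassandra.replace_address or the older -Dcassandra.replace_token):

    # on the replacement node: empty data dir, same cluster_name and seeds as before
    # add the replace flag to the JVM options, e.g. in cassandra-env.sh:
    #   JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.1"
    sudo service cassandra start   # or however you start cassandra on your install
    nodetool netstats              # on any node, to watch the streaming progress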

All this also assumes, of course, that you are replicating your data such that 
the new node can retrieve the information it is responsible for from the other 
nodes.
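
One quick way to sanity-check that before pulling a node (the keyspace name below 
is just a placeholder):

    cqlsh> DESCRIBE KEYSPACE mykeyspace;

and confirm the replication strategy shows a replication factor greater than 1.  
nodetool status (or nodetool ring) will also show how data ownership is spread 
across the remaining nodes.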

Thanks,
-Mike


On Apr 21, 2013, at 4:18 PM, aaron morton wrote:

> Sorry, I do not understand your question. What are the two solutions? 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 20/04/2013, at 3:43 AM, Kais Ahmed <k...@neteck-fr.com> wrote:
> 
>> Hello and thank you for your answers.
>> 
>> The first solution is much easier for me because I use vnodes.
>> 
>> What is the risk of the first solution?
>> 
>> thank you,
>> 
>> 
>> 2013/4/18 aaron morton <aa...@thelastpickle.com>
>> This is roughly the lift and shift process I use. 
>> 
>> Note that disabling thrift and gossip does not stop an existing repair 
>> session. So I often drain and then shut down, and copy the live data dir 
>> rather than a snapshot dir. 
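>> 
>> A minimal sketch of that variant, with a placeholder destination host and the 
>> default data directory (yours may differ):
>> 
>>     nodetool drain                # flush memtables; the node stops accepting writes
>>     sudo service cassandra stop   # or however you stop the service
>>     rsync -a /var/lib/cassandra/data/ newhost:/var/lib/cassandra/data/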
>> 
>> Cheers
>>  
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 19/04/2013, at 4:10 AM, Michael Theroux <mthero...@yahoo.com> wrote:
>> 
>>> This should work.  
>>> 
>>> Another option is to follow a process similar to what we recently did.  We 
>>> recently and successfully upgraded 12 instances from large to xlarge 
>>> instances in AWS.  I chose not to replace nodes as restoring data from the 
>>> ring would have taken significant time and put the cluster under some 
>>> additional load.  I also wanted to eliminate the possibility that any 
>>> issues on the new nodes could be blamed on new configuration/operating 
>>> system differences.  Instead, we used the following procedure (omitting some 
>>> details that are likely unique to our infrastructure).
>>> 
>>> For a node being upgraded:
>>> 
>>> 1) nodetool disablethrift
>>> 2) nodetool disablegossip
>>> 3) Snapshot the data (nodetool snapshot ...)
>>> 4) Backup the snapshot data to EBS (assuming you are on ephemeral)
>>> 5) Stop cassandra
>>> 6) Move the cassandra.yaml configuration file to cassandra.yaml.bak (to 
>>> prevent any future instance restart from bringing cassandra back up)
>>> 7) Shutdown the instance
>>> 8) Take an AMI of the instance
>>> 9) Start a new instance from the AMI with the desired hardware
>>> 10) If you assign the new instance a new IP address, make sure any entries 
>>> in /etc/hosts and the broadcast_address in cassandra.yaml are updated
>>> 11) Attach the volume containing your snapshot backup to the new instance 
>>> and mount it
>>> 12) Restore the snapshot data
>>> 13) Restore cassandra.yaml file
>>> 14) Start cassandra
>>> 
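>>> As a rough, condensed sketch, steps 1-6 come down to something like this on 
>>> the node being upgraded (the snapshot tag, backup mount point and paths here 
>>> are assumptions for the example):
>>> 
>>>     nodetool disablethrift
>>>     nodetool disablegossip
>>>     nodetool snapshot -t pre-resize
>>>     # one simple way to get the data (including the snapshot) onto the EBS volume:
>>>     rsync -a /var/lib/cassandra/data/ /mnt/ebs-backup/data/
>>>     sudo service cassandra stop
>>>     sudo mv /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.bak
>>> 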
>>> - I recommend practicing this on a test cluster first
>>> - As you replace nodes with new IP addresses, your seed lists will 
>>> eventually need to be updated.  This is not a big deal until all of your 
>>> original seed nodes have been replaced.
>>> - Don't forget about NTP!  Make sure it is running on all your new nodes.  
>>> To be extra careful, I actually deleted the ntp drift file and let NTP 
>>> recalculate it because it's a new instance, and it took over an hour to 
>>> restore our snapshot data... but that may have been overkill.
>>> - If you have the opportunity, depending on your situation, increase 
>>> max_hint_window_in_ms (see the snippet after these notes)
>>> - Your details may vary
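>>> 
>>> For the hint window, that is a one-line change in cassandra.yaml on the 
>>> nodes that stay up (the value below is only an example, roughly 12 hours):
>>> 
>>>     max_hint_window_in_ms: 43200000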
>>> 
>>> Thanks,
>>> -Mike
>>> 
>>> On Apr 18, 2013, at 11:07 AM, Alain RODRIGUEZ wrote:
>>> 
>>>> I would say add your 3 new servers at the 3 tokens where you want them, 
>>>> let's say:
>>>> 
>>>> {
>>>>     "0": {
>>>>         "0": 0,
>>>>         "1": 56713727820156410577229101238628035242,
>>>>         "2": 113427455640312821154458202477256070485
>>>>     }
>>>> }
>>>> 
>>>> or those tokens -1 or +1 if those exact tokens are already in use. Then 
>>>> just decommission the m1.xlarge nodes, and you should be good to go.
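>>>> 
>>>> A rough sketch of that sequence, assuming single-token nodes (the token 
>>>> below is one of the values above; with vnodes you would rely on num_tokens 
>>>> rather than initial_token):
>>>> 
>>>>     # cassandra.yaml on each new hi1.4xlarge, before its first start:
>>>>     initial_token: 0
>>>> 
>>>> and once all three new nodes have joined, on each old m1.xlarge in turn:
>>>> 
>>>>     nodetool decommission   # streams this node's ranges to the remaining nodes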
>>>> 
>>>> 
>>>> 
>>>> 2013/4/18 Kais Ahmed <k...@neteck-fr.com>
>>>> Hi,
>>>> 
>>>> What is the best practice to move from a cluster of 7 nodes (m1.xlarge) to 
>>>> 3 nodes (hi1.4xlarge).
>>>> 
>>>> Thanks,
>>>> 
>>> 
>> 
>> 
> 
