In this case, does it make sense to remove the newly added nodes, correct the configuration, and have them rejoin one at a time?
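Roughly what I have in mind, per node, is below (assuming a stock package install with the default /var/lib/cassandra directories, and assuming decommission is the right way to back a node out -- please correct me if I'm off here):

    nodetool decommission            # have the node leave the ring, streaming out whatever it holds
    sudo service cassandra stop
    # fix cassandra.yaml: set auto_bootstrap: true, or just delete the override
    sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*
    sudo service cassandra start     # node should bootstrap and stream its data this time
    nodetool status                  # wait for this node to show UN before starting the next one

and only move on to the next node once the previous one has fully joined.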
Thx

> On Oct 18, 2015, at 11:19 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>
> Take a snapshot now, before you get rid of any data (whatever you do, don't run cleanup).
>
> If you identify missing data, you can go back to those snapshots, find the nodes that had the data previously (sstable2json, for example), and either re-stream that data into the cluster with sstableloader or copy it to a new host and `nodetool refresh` it into the new system.
>
> From: <burtonator2...@gmail.com> on behalf of Kevin Burton
> Reply-To: "user@cassandra.apache.org"
> Date: Sunday, October 18, 2015 at 8:10 PM
> To: "user@cassandra.apache.org"
> Subject: Re: Would we have data corruption if we bootstrapped 10 nodes at once?
>
> Ouch.. OK.. I think I really shot myself in the foot here then. This might be bad.
>
> I'm not sure if I would have missing data. I mean, basically the data is on the other nodes, but the cluster has been running with 10 nodes accidentally bootstrapped with auto_bootstrap=false.
>
> So they have new data and seem to be missing values.
>
> This is somewhat misleading... initially, if you start a node up and run nodetool status, it only returns that one node.
>
> So I assumed auto_bootstrap=false meant that it just doesn't join the cluster.
>
> I'm running a nodetool repair now to hopefully fix this.
>
>> On Sun, Oct 18, 2015 at 7:25 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
>>
>> auto_bootstrap=false tells it to join the cluster without running bootstrap -- the node assumes it has all of the necessary data, and won't stream any missing data.
>>
>> This generally violates consistency guarantees, but if done on a single node, is typically correctable with `nodetool repair`.
>>
>> If you do it on many nodes at once, it's possible that the new nodes could represent all 3 replicas of the data, but don't physically have any of that data, leading to missing records.
>>
>> From: <burtonator2...@gmail.com> on behalf of Kevin Burton
>> Reply-To: "user@cassandra.apache.org"
>> Date: Sunday, October 18, 2015 at 3:44 PM
>> To: "user@cassandra.apache.org"
>> Subject: Re: Would we have data corruption if we bootstrapped 10 nodes at once?
>>
>> Ah shit.. I think we're seeing corruption.. missing records :-/
>>
>>> On Sat, Oct 17, 2015 at 10:45 AM, Kevin Burton <bur...@spinn3r.com> wrote:
>>>
>>> We just migrated from a 30 node cluster to a 45 node cluster (so 15 new nodes).
>>>
>>> By default we have auto_bootstrap = false, so we just push our config to the cluster, the cassandra daemons restart, and the new nodes aren't cluster members yet -- each one sees only itself in the cluster.
>>>
>>> Anyway. While I was about 1/2 way done adding the 15 nodes, I had about 7 members of the cluster and 8 not yet joined.
>>>
>>> We are only doing 1 at a time because apparently bootstrapping more than 1 is unsafe.
>>>
>>> I did a rolling restart whereby I went through and restarted all the cassandra boxes.
>>>
>>> Somehow the new nodes auto bootstrapped themselves EVEN though auto_bootstrap=false.
>>>
>>> We don't have any errors. Everything seems functional. I'm just worried about data loss.
>>>
>>> Thoughts?
>>>
>>> Kevin
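Also, just to make sure I follow the snapshot / restore suggestion above, this is roughly how I read it (the keyspace/table names, snapshot tag, sstable file name, and host IPs below are only placeholders, and the paths assume the default data directory layout):

    # on each of the original nodes, before deleting anything:
    nodetool snapshot -t pre-fix my_keyspace

    # check whether a snapshotted sstable held the rows that are now missing:
    sstable2json /var/lib/cassandra/data/my_keyspace/my_table/snapshots/pre-fix/my_keyspace-my_table-jb-123-Data.db

    # either stream those sstables back into the live cluster:
    sstableloader -d 10.0.0.1,10.0.0.2 /path/to/snapshot/my_keyspace/my_table/

    # ... or copy them into the table's data directory on a node and load them in place:
    nodetool refresh my_keyspace my_table

Is that the right shape, or am I misreading the sstableloader vs refresh options?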