Hi Christian,
The problem you mention (violation of consistency) is a real one. If I have 
understood correctly, it is resolved in Cassandra 2.1 (see 
CASSANDRA-2434).
Regards,
Samuel


horschi <hors...@gmail.com> wrote on 10/09/2015 12:41:41:

> From: horschi <hors...@gmail.com>
> To: user@cassandra.apache.org
> Date: 10/09/2015 12:42
> Subject: Re: Is it possible to bootstrap the 1st node of a new DC?
> 
> Hi Rob,
> 
> regarding 1-3:
> Thank you for the step-by-step explanation :-) My mistake was to use
> join_ring=false during the initial start already. It now works for me
> as it's supposed to. Nevertheless, it does not do what I want, as the
> node does not take writes during the repair/rebuild: running an 8-hour
> repair will lead to 8 hours of missing data.
> 
> regarding 1-6:
> This is what we did, and it works of course. Our issue was just that
> we had some global QUORUM reads hidden somewhere, which the operator
> was not aware of. Therefore it would have been nice if the ops guy could
> prevent these reads by himself.
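
A minimal sketch of why a hidden global QUORUM matters here, assuming the
usual quorum rule (floor(RF / 2) + 1) and a hypothetical layout of two
data centers named DC1 and DC2 with RF=3 each. The function names are
illustrative and not part of any Cassandra API.

```python
def quorum(rf: int) -> int:
    """Number of replicas that form a quorum for a replication factor."""
    return rf // 2 + 1


def required_replicas(consistency: str, rf_by_dc: dict, local_dc: str) -> int:
    """Replica responses needed for a request at the given consistency level."""
    if consistency == "QUORUM":
        # Global quorum counts replicas across all data centers.
        return quorum(sum(rf_by_dc.values()))
    if consistency == "LOCAL_QUORUM":
        # Local quorum counts only replicas in the coordinator's data center.
        return quorum(rf_by_dc[local_dc])
    if consistency == "LOCAL_ONE":
        return 1
    raise ValueError(f"unsupported consistency level: {consistency}")


def min_remote_replicas(consistency: str, rf_by_dc: dict, local_dc: str) -> int:
    """Smallest number of replicas outside local_dc a request must involve."""
    needed = required_replicas(consistency, rf_by_dc, local_dc)
    if consistency.startswith("LOCAL_"):
        return 0
    return max(0, needed - rf_by_dc[local_dc])


# Hypothetical layout: the old DC1 plus a new DC2 that is still rebuilding.
rf_by_dc = {"DC1": 3, "DC2": 3}
for level in ("QUORUM", "LOCAL_QUORUM", "LOCAL_ONE"):
    print(level,
          required_replicas(level, rf_by_dc, "DC1"),
          min_remote_replicas(level, rf_by_dc, "DC1"))
```

Under these assumptions a global QUORUM needs 4 of 6 replicas, so at
least one replica in the new data center takes part, while LOCAL_QUORUM
and LOCAL_ONE can be served entirely from DC1.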
> 
> 
> Another issue I think the current bootstrapping process has: Doesn't
> it practically reduce the RF for old data by one? (By old data I 
> mean any data that was written before the bootstrap.)
> 
> Let me give an example:
> 
> Let's assume I have a cluster of nodes 1, 2 and 3 with RF=3. And let's 
> assume a single write on node 2 got lost. So this particular write 
> is only available on nodes 1 and 3.
> 
> Now I add node 4, which takes the range in such a way that node 1 
> will not own that previously written key any more. Also assume that 
> the new node loads its data from node 2.
> 
> This means we have a cluster where the previously mentioned write is
> only on node 3. (Node 1 is not responsible for the key any more and 
> node 4 loaded its data from the wrong node.)
> 
> Any quorum read that hits nodes 2 and 4 will not return the column. So 
> this means we effectively lowered the CL/RF.
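
The example above can be checked with a small simulation. This is only a
sketch of the scenario as described (node numbers and the streaming
source are taken from the example), not a model of Cassandra's actual
streaming or read path; the helper names are hypothetical.

```python
from itertools import combinations


def quorum_reads(replicas: set, holders: set) -> dict:
    """Map each possible quorum of replicas to whether it sees the write."""
    q = len(replicas) // 2 + 1
    return {
        combo: any(node in holders for node in combo)
        for combo in combinations(sorted(replicas), q)
    }


def bootstrap(replicas: set, holders: set, new_node: int,
              leaving_node: int, stream_source: int):
    """Replace leaving_node with new_node, which copies stream_source's data."""
    new_replicas = (replicas - {leaving_node}) | {new_node}
    new_holders = holders & new_replicas
    if stream_source in holders:
        # The new node only gets the write if its source actually had it.
        new_holders = new_holders | {new_node}
    return new_replicas, new_holders


# Before: RF=3 on nodes 1, 2, 3; the write was lost on node 2.
replicas_before, holders_before = {1, 2, 3}, {1, 3}
before = quorum_reads(replicas_before, holders_before)

# Node 4 takes over the range from node 1 and streams from node 2.
replicas_after, holders_after = bootstrap(
    replicas_before, holders_before,
    new_node=4, leaving_node=1, stream_source=2,
)
after = quorum_reads(replicas_after, holders_after)

print("before:", before)
print("after: ", after)
```

In this model every quorum sees the write before the bootstrap, while
afterwards the quorum made of nodes 2 and 4 does not.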
> 
> Therefore what I would like to be able to do is:
> - Add new node 4, but leave it in a joining state. (This means it 
> gets all the writes but does not serve reads.)
> - Do "nodetool rebuild"
> - New node should not serve reads yet. And node 1 should not yet 
> give up its ranges to node 4.
> - Do "nodetool repair", to ensure consistency.
> - Finish bootstrap. Now node1 should not be responsible for the 
> range and node4 should become eligible for reads.
> 
> regards,
> Christian
> 
> On Tue, Sep 8, 2015 at 11:51 PM, Robert Coli <rc...@eventbrite.com> wrote:
> On Tue, Sep 8, 2015 at 2:39 PM, horschi <hors...@gmail.com> wrote:
> I tried to set up a new node with join_ring=false once. In my test 
> that node did not pick a token in the ring. I assume running repair 
> or rebuild would not do anything in that case: No tokens = no data. 
> But I must admit: I have not tried running rebuild.
> 
> I admit I haven't been following this thread closely, perhaps I have
> missed what exactly it is you're trying to do.
> 
> It's possible you'd need to :
> 
> 1) join the node with auto_bootstrap=false
> 2) immediately stop it
> 3) re-start it with join_ring=false
> 
> To actually use repair or rebuild in this way.
> 
> However, if your goal is to create a new data-center and rebuild a 
> node there without any risk of reading from that node while creating
> the new data center, you can just :
> 
> 1) create nodes in new data-center, with RF=0 for that DC
> 2) change RF in that DC
> 3) run rebuild on new data-center nodes
> 4) while doing so, don't talk to new data-center coordinators from your
> client
> 5) and also use LOCAL_ONE/LOCAL_QUORUM to avoid cross-data-center 
> reads from your client
> 6) modulo the handful of current bugs which make 5) currently imperfect
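
Steps 2 and 3 above could look roughly like the following. This is a
sketch that only builds the statement and command as text; the keyspace
name my_ks and the data-center names DC1 and DC2 are placeholders, and
it assumes the keyspace uses NetworkTopologyStrategy (where a data
center left out of the replication map holds no replicas).

```python
def alter_keyspace_cql(keyspace: str, rf_by_dc: dict) -> str:
    """Build an ALTER KEYSPACE statement setting per-data-center RF."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in rf_by_dc.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


def rebuild_command(source_dc: str) -> list:
    """Command to run on each new node, streaming from the existing DC."""
    return ["nodetool", "rebuild", "--", source_dc]


# Step 2: give the new data center (placeholder name DC2) its replicas.
print(alter_keyspace_cql("my_ks", {"DC1": 3, "DC2": 3}))

# Step 3: run on every node in the new data center.
print(" ".join(rebuild_command("DC1")))
```

Clients would keep using LOCAL_ONE or LOCAL_QUORUM against the old data
center while the rebuild runs, as in steps 4 and 5.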
> 
> What problem are you encountering with this procedure? If it's this ...
> 
> I've learned from experience that the node immediately joins the
> cluster, and starts accepting reads (from other DCs) for the range it
> owns.
> 
> This seems to be the incorrect assumption at the heart of the 
> confusion. You "should" be able to prevent this behavior entirely 
> via correct use of ConsistencyLevel and client configuration.
> 
> In an ideal world, I'd write a detailed blog post explaining this...
> :/ in my copious spare time...
> 
> =Rob
>  
