Kurt/All,
Why should the number of racks be equal to the RF? For example, we have 2 DCs, each with 6 machines and RF=3, and each machine is virtualized into 8 VMs. Can we set up 6 racks with RF=3, i.e. one rack per physical machine to isolate hardware failures? Or is it better to set up only 3 racks, with 2 machines per rack? Thanks

------------------ Original Message ------------------
From: "Anuj Wadehra" <anujw_2...@yahoo.co.in.INVALID>
Date: Thursday, 27 July 2017, 1:41
To: "Brooke Thorley" <bro...@instaclustr.com>; "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: "Peng Xiao" <2535...@qq.com>
Subject: Re: tolerate how many nodes down in the cluster

Hi Brooke,

Very nice presentation: https://www.youtube.com/watch?v=QrP7G1eeQTI !! Good to know that you are able to leverage racks for operational efficiencies. I think vnodes have made life easier.

I still see some concerns with racks:

1. Scaling needs are usually driven by business requirements, and customers want value for every penny they spend. Adding 3 or 5 servers at a time (because you have RF=3 or 5) instead of 1 costs them dearly, and the extra cost is hard to justify because racks can only improve fault tolerance, not guarantee it.

2. For large clusters, you need to maintain a mapping between logical racks (= RF) and physical racks (a multiple of RF).

3. Using racks tightly couples your hardware decisions (rack size, rack count) and virtualization decisions (VM size, VM count per physical node) to the application's RF.

Thanks,
Anuj

On Tuesday, 25 July 2017 3:56 AM, Brooke Thorley <bro...@instaclustr.com> wrote:

Hello Peng,

I think spending the time to set up your nodes into racks is worth it for the benefits it brings. With RF=3 and NetworkTopologyStrategy you can tolerate the loss of a whole rack of nodes without losing QUORUM, as each rack will contain a full set of data. It also makes ongoing cluster maintenance easier, as you can perform upgrades, repairs and restarts on a whole rack of nodes at once.

Setting up racks or adding nodes is not difficult, particularly if you are using vnodes: you simply add nodes in multiples of <num racks> to keep the racks balanced. This is how we run all our managed clusters, and it works very well. You may be interested in my Cassandra Summit presentation from last year, in which I discussed this very topic: https://www.youtube.com/watch?v=QrP7G1eeQTI (from 4:00).

If you were to consider changing your rack topology, I would recommend doing so by DC migration rather than "in place".

Kind Regards,
Brooke Thorley
VP Technical Operations & Customer Services
supp...@instaclustr.com | support.instaclustr.com

On 25 July 2017 at 03:06, Anuj Wadehra <anujw_2...@yahoo.co.in.invalid> wrote:

Hi Peng,

Three things matter when you evaluate fault tolerance and availability for your cluster:

1. RF
2. CL
3. Topology - how data is replicated across racks.

If you assume that N nodes from ANY rack may fail at the same time, then you can afford the failure of RF - CL nodes and still be 100% available. E.g. if you are reading at QUORUM with RF=3, you can only afford one (3 - 2) node failure. Thus, even with a 30-node cluster, a 10-node failure cannot leave you 100% available: availability is determined by the RF, not by the total number of nodes in the cluster.

If you assume instead that the N nodes failing together will ALWAYS be in the same rack, you can spread your servers across RF physical racks and use NetworkTopologyStrategy. When allocating the replicas for any piece of data, Cassandra will then ensure that the 3 replicas land in 3 different racks. E.g. with 10 nodes in each of 3 racks, even a 10-node failure within the SAME rack leaves you 100% available, because two replicas of all data remain and CL=QUORUM can still be met. I have not tested this, but that is how the rack concept is expected to work.
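To make the RF - CL arithmetic concrete, here is a minimal sketch (plain Python; the helper names are made up, not from any Cassandra driver):

    # Hypothetical helpers to illustrate the arithmetic, not driver code.
    def replicas_needed(cl, rf):
        """Replicas that must respond to satisfy a consistency level."""
        return {"ONE": 1, "TWO": 2, "QUORUM": rf // 2 + 1, "ALL": rf}[cl]

    def tolerable_failures(cl, rf):
        """Replicas of a given key that may be down while the CL is still met."""
        return rf - replicas_needed(cl, rf)

    print(tolerable_failures("QUORUM", 3))  # 3 - 2 = 1
    print(tolerable_failures("QUORUM", 5))  # 5 - 3 = 2
    print(tolerable_failures("ONE", 3))     # 3 - 1 = 2

With rack-unaware placement, 10 arbitrary failures in a 30-node cluster will almost certainly cover all 3 replicas of at least one key, which is why these per-key numbers are what matter.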
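And a toy model of the rack-aware placement described above (a deliberate simplification of NetworkTopologyStrategy, not Cassandra's actual allocation code):

    # Toy model: walk the ring clockwise from the key's token and
    # prefer nodes whose rack does not already hold a replica.
    def pick_replicas(ring, rf):
        replicas, racks_used = [], set()
        for node, rack in ring:  # first pass: at most one node per rack
            if len(replicas) == rf:
                break
            if rack not in racks_used:
                replicas.append(node)
                racks_used.add(rack)
        for node, _ in ring:  # top up if there are fewer racks than RF
            if len(replicas) == rf:
                break
            if node not in replicas:
                replicas.append(node)
        return replicas

    # Six nodes in three racks, listed in ring order from some key's token:
    ring = [("n1", "r1"), ("n2", "r1"), ("n3", "r2"),
            ("n4", "r2"), ("n5", "r3"), ("n6", "r3")]
    print(pick_replicas(ring, 3))  # ['n1', 'n3', 'n5'] - one replica per rack

Because each rack holds exactly one of the three replicas, losing an entire rack still leaves 2 of 3 replicas, so QUORUM reads and writes keep succeeding.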
I agree, using racks generally makes operations tougher.

Thanks,
Anuj

On Mon, 24 Jul 2017 at 20:10, Peng Xiao <2535...@qq.com> wrote:

Hi Bhuvan,

The following link suggests not using racks, and it looks reasonable: http://www.datastax.com/dev/blog/multi-datacenter-replication

"Defining one rack for the entire cluster is the simplest and most common implementation. Multiple racks should be avoided for the following reasons:

• Most users tend to ignore or forget rack requirements that state racks should be in an alternating order to allow the data to get distributed safely and appropriately.
• Many users are not using the rack information effectively by using a setup with as many racks as they have nodes, or similar non-beneficial scenarios.
• When using racks correctly, each rack should typically have the same number of nodes.
• In a scenario that requires a cluster expansion while using racks, the expansion procedure can be tedious since it typically involves several node moves and has to ensure that racks will be distributing data correctly and evenly. At times when clusters need immediate expansion, racks should be the last things to worry about."

------------------ Original Message ------------------
From: "Bhuvan Rawal" <bhu1ra...@gmail.com>
Date: Monday, 24 July 2017, 7:17
To: "user" <user@cassandra.apache.org>
Subject: Re: tolerate how many nodes down in the cluster

Hi Peng,

This really depends on how you have configured your topology. Say you have segregated your DC into 3 racks with 10 servers each: with an RF of 3, you can safely assume your data will be available if one whole rack goes down. But if servers fail across different racks, RF=3 no longer guarantees a full set of replicas, and in that case you can lose at most 2 servers and still be available.

The best idea would be to plan failure modes appropriately and to let Cassandra know about your topology.

Regards,
Bhuvan

On Mon, Jul 24, 2017 at 3:28 PM, Peng Xiao <2535...@qq.com> wrote:

Hi,

Suppose we have a 30-node cluster in one DC with RF=3. How many nodes can be down? Can we tolerate 10 nodes down? It seems we cannot prevent all 3 replicas of some data from falling within those 10 nodes, so can we only tolerate 1 node down even though we have 30 nodes?

Could anyone please advise? Thanks.