Hi Ken,

Thanks. Good point.

Markus
Ken Hancock <ken.hanc...@schange.com> wrote at 15:15 on Tuesday, 15 April 2014:
 
>Keep in mind that if you lose the wrong two nodes, you can't satisfy quorum. In a 5-node 
>cluster with RF=3, it would be impossible to lose 2 nodes without affecting 
>quorum for at least some of your data. In a 6-node cluster, once you've lost 
>one node, if you were to lose another, you only have a 1-in-5 chance of not 
>affecting quorum for some of your data.
>
>In much larger clusters, it becomes less probable that you will lose multiple 
>nodes within a RF group.
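
The 5-node and 6-node figures above can be checked by brute force. The following is a minimal sketch, assuming one token per node (no vnodes) and a simple ring placement in which each range is stored on RF consecutive nodes; the function names are illustrative and are not part of Cassandra.

```python
from itertools import combinations

def replica_sets(n, rf):
    """Replica set for each token range: rf consecutive nodes on the ring (assumed placement)."""
    return [frozenset((start + k) % n for k in range(rf)) for start in range(n)]

def quorum_safe_pairs(n, rf=3):
    """Return the pairs of failed nodes that leave QUORUM reachable for every range."""
    quorum = rf // 2 + 1
    groups = replica_sets(n, rf)
    safe = []
    for down in combinations(range(n), 2):
        # A pair is safe only if every replica set still has a quorum of live nodes.
        if all(len(group - set(down)) >= quorum for group in groups):
            safe.append(down)
    return safe

def safe_second_failures(n, first_down=0, rf=3):
    """Given one node already down, list the second failures that keep quorum everywhere."""
    return [b if a == first_down else a
            for a, b in quorum_safe_pairs(n, rf)
            if first_down in (a, b)]

if __name__ == "__main__":
    for n in (5, 6):
        print(n, "nodes: safe pairs =", quorum_safe_pairs(n),
              "| safe second failures after node 0 =", safe_second_failures(n))
```

Under these assumptions, 5 nodes yield no safe pair at all, and with 6 nodes only the node directly opposite the first failure on the ring is a safe second loss, which is the 1-in-5 case.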
>
>On Tue, Apr 15, 2014 at 4:37 AM, Markus Jais <markus.j...@yahoo.de> wrote:
>
>>Hi all,
>>
>>
>>thanks for your answers. Very helpful. We plan to use enough nodes so that 
>>the failure of 1 or 2 machines is no problem. E.g. for a workload that can be 
>>handled by 3 nodes all the time, we would use at least 5, better 6 nodes to 
>>survive the failure of at least 2 nodes, even when the 2 nodes fail at the 
>>same time. This should allow the cluster to rebuild the missing nodes and 
>>still serve all requests with RF=3 and QUORUM reads.
>>
>>
>>All the best,
>>
>>
>>Markus
>>
>>Tupshin Harper <tups...@tupshin.com> wrote at 21:23 on Monday, 14 April 2014:
>> 
>>>tl;dr make sure you have enough capacity in the event of node failure. For 
>>>light workloads, that can be fulfilled with nodes=rf. 
>>>-Tupshin
>>>On Apr 14, 2014 2:35 PM, "Robert Coli" <rc...@eventbrite.com> wrote:
>>>
>>>On Mon, Apr 14, 2014 at 2:25 AM, Markus Jais <markus.j...@yahoo.de> wrote:
>>>>
>>>>"It is generally not recommended to set a replication factor of 3 if you 
>>>>have fewer than six nodes in a data center".
>>>>
>>>>
>>>>I have a detailed post about this somewhere in the archives of this list 
>>>>(which I can't seem to find right now..) but briefly, the "6-for-3" advice 
>>>>relates to the percentage of capacity you have remaining when you have a 
>>>>node down. It has become slightly less accurate over time because vnodes 
>>>>reduce bootstrap time and there have been other improvements to node 
>>>>startup time.
>>>>
>>>>
>>>>If you have fewer than 6 nodes with RF=3, you lose >1/6th of capacity when 
>>>>you lose a single node, which is a significant percentage of total cluster 
>>>>capacity. You then lose another meaningful percentage of your capacity when 
>>>>your existing nodes participate in rebuilding the missing node. If you are 
>>>>then unlucky enough to lose another node, you are missing a very 
>>>>significant percentage of your cluster capacity and have to use a 
>>>>relatively small fraction of it to rebuild the now two down nodes.
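
The capacity arithmetic in the paragraph above can be made concrete. This is a rough sketch, assuming identical nodes with evenly spread load; it ignores the extra cost of streaming data to a rebuilding node, which the thread does not quantify. The helper names are hypothetical.

```python
from fractions import Fraction

def remaining_capacity(nodes, down):
    """Fraction of total cluster capacity still available with `down` nodes lost."""
    return Fraction(nodes - down, nodes)

def load_multiplier(nodes, down):
    """How much more work each surviving node carries, relative to a healthy cluster."""
    return Fraction(nodes, nodes - down)

if __name__ == "__main__":
    for nodes in (3, 5, 6, 12):
        for down in (1, 2):
            print(f"{nodes} nodes, {down} down: "
                  f"{float(remaining_capacity(nodes, down)):.0%} capacity left, "
                  f"{float(load_multiplier(nodes, down)):.2f}x load per survivor")
```

For example, losing one node out of 6 removes exactly 1/6 of capacity, while losing one out of 3 removes a third, which is why the smaller clusters have so little headroom left for a rebuild.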
>>>>
>>>>
>>>>I wouldn't generalize the rule of thumb as "don't run under N=RF*2", but 
>>>>rather as "probably don't run RF=3 under about 6 nodes". IOW, in my view, 
>>>>the most operationally sane initial number of nodes for RF=3 is likely 
>>>>closer to 6 than 3.
>>>>
>>>>
>>>>=Rob
>>>>
>>>>
