Thanks for your thoughts guys.

I agree that vnodes lessen total downtime, although it also seems that
the total number of outages (however small) would be greater.

But I think downtime is only lessened up to a certain cluster size.

I'm thinking that as the cluster continues to grow:
  - node rebuild time will stop shrinking, since rebuild speed maxes out
    (a single node only has so much write bandwidth)
  - the probability of 2 nodes being down at any given time will continue
    to increase, even if you consider only non-correlated failures.

Therefore, when adding nodes beyond the point where rebuild speed maxes
out, wouldn't both the total number of outages *and* overall downtime
increase?
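
Here is a rough sketch of the second point. It assumes node failures are
independent and that every node is down with the same probability p_down
at any instant. The function name and the value 0.001 are made up for
illustration; p_down is held constant to stand in for the regime where
rebuild time no longer shrinks as nodes are added.

```python
def p_two_or_more_down(n_nodes: int, p_down: float) -> float:
    """Probability that at least two of n_nodes are down at the same instant.

    Assumes independent failures and an identical per-node unavailability.
    """
    p_none = (1.0 - p_down) ** n_nodes
    p_exactly_one = n_nodes * p_down * (1.0 - p_down) ** (n_nodes - 1)
    return 1.0 - p_none - p_exactly_one


# p_down = 0.001 is a hypothetical figure, not a measurement.
for n in (10, 50, 100, 500, 1000):
    print(f"{n:5d} nodes: P(>=2 down) = {p_two_or_more_down(n, 0.001):.6f}")
```

With p_down fixed, the result only grows with cluster size. Whether two
down nodes actually cost availability still depends on whether they share
a token range, which is much more likely with vnodes than without.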

Thanks,
Eric




On Mon, Dec 10, 2012 at 7:00 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Assume you need to work with quorum in a non-vnode scenario. That means
> that if 2 nodes in a row in the ring are down, some number of quorum
> operations will fail with UnavailableException (TimeoutException right
> after the failures). This is because quorum will be impossible for a
> given range of tokens, but still possible for others.
>
> In a vnode world, if any two nodes are down, then the intersection of
> the vnode token ranges they hold is unavailable.
>
> I think it is two sides of the same coin.
>
>
> On Mon, Dec 10, 2012 at 7:41 AM, Richard Low <r...@acunu.com> wrote:
>
>> Hi Tyler,
>>
>> You're right, the math does assume independence, which is unlikely to be
>> accurate.  But if you do have correlated failure modes (e.g. same power,
>> racks, DC, etc.) then you can still use Cassandra's rack-aware or DC-aware
>> features to ensure replicas are spread around so your cluster can survive
>> the correlated failure mode.  So I would expect vnodes to improve uptime in
>> all scenarios, but I haven't done the math to prove it.
>>
>> Richard.
>>
>
>
