Re: Making RF4 useful aka primary and secondary ranges

Carl Mueller Wed, 14 Mar 2018 14:56:31 -0700

I also wonder if the state of hinted handoff can inform the validity of
extra replicas. Repair is mentioned in 7168.



On Wed, Mar 14, 2018 at 4:55 PM, Carl Mueller <[email protected]>
wrote:

> For my reference: https://issues.apache.org/jira/browse/CASSANDRA-7168
>
>
> On Wed, Mar 14, 2018 at 4:46 PM, Ariel Weisberg <[email protected]> wrote:
>
>> Hi,
>>
>> There is a JIRA for decoupling the size of the group used for
>> consensus from the level of data redundancy:
>> https://issues.apache.org/jira/browse/CASSANDRA-13442
>>
>> It's been discussed quite a bit offline and I did a presentation on it at
>> NGCC. Hopefully we will see some movement on it soon.
>>
>> Ariel
>>
>> On Wed, Mar 14, 2018, at 5:40 PM, Carl Mueller wrote:
>> > Currently there is little use for RF4. You get the requirements of a
>> > three-node QUORUM but only one extra backup.
>> >
>> > I'd like to propose something that would make RF4 a sort of more heavily
>> > backed up RF3.
>> >
>> > A lot of this is probably achievable with strictly driver-level logic,
>> > so perhaps it would belong more there.
>> >
>> > Basically the idea is to have four replicas of the data, but in
>> > practice only have to do QUORUM with three nodes. We consider the first
>> > three replicas the "primary replicas". On an ongoing basis for QUORUM
>> > reads and writes, we would rely on only those three replicas to satisfy
>> > two-out-of-three QUORUM. Writes are persisted to the fourth replica in
>> > the normal manner of Cassandra; it just doesn't count towards the
>> > QUORUM write.
>> >
>> > On reads, with token and node health awareness by the driver, if the
>> > primaries are all healthy, two-of-three QUORUM is calculated from those.
>> >
>> > If however one of the three primaries is down, read QUORUM is a bit
>> > different:
>> > 1) if the first two replies come from the two remaining primaries and
>> > agree, that is returned
>> > 2) if the first two replies are a primary and the "hot spare" and those
>> > agree, that is returned
>> > 3) if the primary and hot spare disagree, wait for the next primary to
>> > return, and then take the agreement (hopefully) that results
>> >
>> > Then once the previous primary comes back online, the read quorum goes
>> > back to preferring that set, on the assumption that hinted handoff and
>> > repair will get it back up to snuff.
>> >
>> > There could also be some mechanism examining the hinted handoff status
>> > of the four to determine when to reactivate the primary that was down.
>> >
>> > For mutations, one could prefer a "QUORUM plus" that was a quorum of the
>> > primaries plus the hot spare.
>> >
>> > Of course one could do multiple hot spares, so RF5 could still be
>> > treated as RF3 + hot spares.
>> >
>> > The goal here is more data resiliency but not having to rely on as many
>> > nodes for resiliency.
>> >
>> > Since the data is ring-distributed, the primary owners of ranges
>> > should still be evenly distributed, and no hot nodes should result.
>>
>
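
The read rules in the original proposal can be sketched in Python as below. This is only an illustration under stated assumptions, not driver or Cassandra code: Reply, resolve_read, and the replica names are hypothetical, and "agreement" is modeled as two replies carrying the same value, as the proposal describes it. Real Cassandra reads reconcile differing replicas by write timestamp, which this sketch ignores. quorum_size is the usual floor(RF / 2) + 1 rule, which is why RF4 needs three replicas for QUORUM.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


def quorum_size(replication_factor: int) -> int:
    """Usual quorum rule: floor(RF / 2) + 1 (RF3 needs 2, RF4 and RF5 need 3)."""
    return replication_factor // 2 + 1


@dataclass(frozen=True)
class Reply:
    replica: str      # hypothetical replica name, e.g. "p1" or "spare"
    is_primary: bool  # True for one of the three "primary replicas"
    value: str        # stand-in for the returned data or its digest


def resolve_read(replies: Sequence[Reply],
                 all_primaries_healthy: bool) -> Optional[str]:
    """Walk replies in arrival order; return the first value two replies agree on.

    While all three primaries are healthy, the hot spare is ignored, so the
    result is a plain two-of-three quorum over the primaries. With one primary
    down, the spare's reply counts too. None means "keep waiting".
    """
    counts: Dict[str, int] = {}
    for reply in replies:
        if all_primaries_healthy and not reply.is_primary:
            continue  # the spare does not vote while every primary is up
        counts[reply.value] = counts.get(reply.value, 0) + 1
        if counts[reply.value] >= quorum_size(3):
            return reply.value
    return None


if __name__ == "__main__":
    # Case 3 from the proposal: primary and spare disagree, so the read
    # waits for the second primary and takes the resulting agreement.
    arrived = [
        Reply("p1", True, "v2"),
        Reply("spare", False, "v1"),
        Reply("p2", True, "v2"),
    ]
    print(resolve_read(arrived, all_primaries_healthy=False))
```

With a single hot spare, any two agreeing replies necessarily include at least one primary, so the sketch needs no separate check for that. With several spares (the RF5 variant) such a check would have to be added.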
