We implemented the patch internally and deployed it to our production clusters.
We see a 2X drop in P99 quorum read latency, because we eliminate one
unnecessary cross-region read. This is a huge improvement, since performance
is critical to our customers.

Again, I'm not trying to change the definition of the QUORUM consistency
level. Instead, we want to improve quorum read latency by removing
unnecessary replica requests, which I think can benefit Cassandra users in
general.
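
To make the replica-count arithmetic concrete, here is a minimal sketch in Python. It is not Cassandra code: the helper names `quorum` and `min_read_replicas` are hypothetical, and it only assumes the usual rule that a read is guaranteed to overlap a write when R + W > N.

```python
def quorum(n: int) -> int:
    """Replicas blocked on for QUORUM with replication factor n: floor(n/2) + 1."""
    return n // 2 + 1


def min_read_replicas(n: int, w: int) -> int:
    """Smallest R such that R + W > N (hypothetical helper, for illustration)."""
    return max(1, n - w + 1)


# Compare what a QUORUM read blocks on today with the minimum a read needs
# when writes are already done at QUORUM.
for n in (3, 4, 6, 8):
    w = quorum(n)
    r = min_read_replicas(n, w)
    print(f"RF={n}: write QUORUM={w}, QUORUM read={quorum(n)}, minimum read={r}")
```

Under these assumptions the minimum read count equals QUORUM for odd replication factors, and is one replica fewer (n/2) for even ones such as RF=4, 6, or 8.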

I will create a JIRA, and we can move discussions there.


Thanks!

On Thu, Jun 8, 2017 at 10:12 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> Short of actually making ConsistencyLevel pluggable or adding/changing one
> of the existing levels, an alternative approach would be to divide up the
> cluster into either real or pseudo-datacenters (with RF=2 in each DC), and
> then write with QUORUM (which would be 3 nodes, across any combination of
> datacenters), and read with LOCAL_QUORUM (which would be 2 nodes in the
> datacenter of the coordinator). You don't have to have distinct physical
> DCs for this, but you'd need tooling to guarantee an even number of
> replicas in each virtual datacenter.
>
> It's an ugly workaround, but it'd work.
>
> Pluggable CL would be nicer, though.
>
>
> On Thu, Jun 8, 2017 at 9:51 PM, Justin Cameron <jus...@instaclustr.com>
> wrote:
>
>> Firstly, this situation only occurs if you need strong consistency and are
>> using an even replication factor (RF4, RF6, etc).
>> Secondly, either the read or the write still needs to be performed at a
>> minimum level of QUORUM. This means there are no extra availability
>> benefits from your proposal (i.e. a minimum of QUORUM replicas still need
>> to be online and available).
>>
>> So the only potential benefit I can think of is a theoretical performance
>> boost. If you write with QUORUM, then you'll need to read with
>> QUORUM-1/HALF (e.g. RF4, write with QUORUM, read with TWO, RF6 write with
>> QUORUM, read with THREE, RF8 write with QUORUM, read with FOUR, ...). At
>> most you'd only reduce the number of replicas that the client needs to
>> block on by 1.
>>
>> I'd guess that the performance benefits that you'd gain will probably be
>> quite small - but I'd happily be proven wrong if you feel like running
>> some benchmarks :)
>>
>> Cheers,
>> Justin
>>
>> On Fri, 9 Jun 2017 at 14:26 Brandon Williams <dri...@gmail.com> wrote:
>>
>> > I don't disagree with you there and have never liked TWO/THREE.  This is
>> > somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
>> >
>> > I don't think going to CL.FOUR, etc, is a good long-term solution, but
>> > I'm also not sure what is.
>> >
>> >
>> > On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <dikan...@gmail.com> wrote:
>> >
>> >> To me, CL.TWO and CL.THREE are more like workarounds for the problem; for
>> >> example, they do not work if the number of replicas goes to 8, which is
>> >> possible in our environment (2 replicas in each of 4 DCs).
>> >>
>> >> What people want from quorum is a strong consistency guarantee. As long as
>> >> R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1);
>> >> c) R=(n/2+1), W=(n/2). What Cassandra is doing right now is option a),
>> >> which is the most expensive option.
>> >>
>> >> I cannot think of a reason why people would want a quorum read just to
>> >> read from (n/2+1) nodes rather than for strong consistency. If they
>> >> want strong consistency, then the read only needs (n/2) nodes; we are
>> >> purely wasting the one extra request, which hurts read latency as well.
>> >>
>> >> Thanks
>> >> Dikang.
>> >>
>> >> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <n...@thelastpickle.com>
>> >> wrote:
>> >>
>> >>>> We have CL.TWO.
>> >>>
>> >>> This was actually the original motivation for CL.TWO and CL.THREE if
>> >>> memory serves:
>> >>> https://issues.apache.org/jira/browse/CASSANDRA-2013
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Dikang
>> >>
>> >>
>> > --
>>
>>
>> *Justin Cameron*, Senior Software Engineer
>>
>>
>> <https://www.instaclustr.com/>
>>
>>
>
>


-- 
Dikang
