Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

Benedict Elliott Smith Thu, 12 Nov 2020 06:31:26 -0800

> Is the new implementation a separate, distinctly modularized new body of work

It’s primarily a distinct, modularised and new body of work; however, some 
shared code has been modified - namely PaxosState, in which legacy 
code is maintained but modified for compatibility, and the system.paxos table 
(which receives a new column and slightly modified serialization code). It is 
conceptually an optimised version of the existing algorithm.

If there's a chance of this being of value to 4.0, I can try to put up a patch 
next week alongside a high-level description of the changes.

> But a performance regression is a regression, I'm not shrugging it off.

I don't want to give the impression I'm shrugging off the correctness issue 
either. It's a serious issue to fix, but since all successful updates to the 
database are linearizable, I think it's likely that many applications behave 
correctly with the present semantics, or at least encounter only transient 
errors. No doubt many also do not, but I have no idea of the ratio.

The regression isn't itself a simple issue either - depending on the topology 
and message latencies, it is not difficult to produce inescapable contention 
(i.e. guaranteed timeouts) that might persist as long as clients continue to 
retry. It could be quite a serious degradation of service to impose on our 
users.

I don't pretend to know the correct way to make a decision balancing these 
considerations, but I am perhaps more concerned about imposing service outages 
than I am about temporarily maintaining semantics our users have apparently accepted 
for years - though I absolutely share your embarrassment there.

On 12/11/2020, 12:41, "Joshua McKenzie" <jmcken...@apache.org> wrote:

    Is the new implementation a separate, distinctly modularized new body of
    work or does it make substantial changes to existing implementation and
    subsume it?

    On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <lebre...@gmail.com> wrote:

    > Regarding option #4, I'll remark that experience tends to suggest users
    > don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
    > likely essentially mean "LWT has a correctness issue, but once it has broken
    > your data enough that you'll notice, you'll be able to dig up the proper flag
    > to fix it for next time". I guess it's better than nothing, of course, but
    > I'll admit that defaulting to "opt-in correctness", especially for a
    > feature (LWT) that exists uniquely to provide additional guarantees, is
    > something I have a hard time rallying behind.
    >
    > But a performance regression is a regression, I'm not shrugging it off.
    > Still, I feel we shouldn't leave LWT with a fairly serious known
    > correctness bug and I frankly feel bad for "the project" that this has been
    > known for so long without action, so I'm a bit biased in wanting to get it
    > fixed asap.
    >
    > But maybe I'm overstating the urgency here, and maybe option #1 is a better
    > way forward.
    >
    > --
    > Sylvain
    >

