Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-12 Thread Sylvain Lebresne
Regarding option #4, I'll remark that experience tends to suggest users
don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
likely mean, in practice, "LWT has a correctness issue, but once it has
broken your data enough for you to notice, you'll be able to dig up the
proper flag to fix it for next time". I guess it's better than nothing, of
course, but I'll admit that defaulting to "opt-in correctness", especially
for a feature (LWT) that exists solely to provide additional guarantees, is
something I have a hard time rallying behind.

But a performance regression is a regression, I'm not shrugging it off.
Still, I feel we shouldn't leave LWT with a fairly serious known
correctness bug and I frankly feel bad for "the project" that this has been
known for so long without action, so I'm a bit biased in wanting to get it
fixed asap.

But maybe I'm overstating the urgency here, and maybe option #1 is a better
way forward.

--
Sylvain


Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-12 Thread Joshua McKenzie
Is the new implementation a separate, distinctly modularized new body of
work or does it make substantial changes to existing implementation and
subsume it?



Re: [DISCUSS] CASSANDRA-12126: LWTs correctness and performance

2020-11-12 Thread Benedict Elliott Smith
> Is the new implementation a separate, distinctly modularized new body of work

It’s primarily a distinct, modularised, new body of work; however, some
shared code has been modified, namely PaxosState (in which legacy code is
maintained but modified for compatibility) and the system.paxos table
(which receives a new column and slightly modified serialization code). It is
conceptually an optimised version of the existing algorithm.

If there's a chance of this being of value to 4.0, I can try to put up a patch
next week alongside a high-level description of the changes.

> But a performance regression is a regression, I'm not shrugging it off.

I don't want to give the impression I'm shrugging off the correctness issue 
either. It's a serious issue to fix, but since all successful updates to the 
database are linearizable, I think it's likely that many applications behave 
correctly with the present semantics, or at least encounter only transient 
errors. No doubt many also do not, but I have no idea of the ratio.

The regression isn't itself a simple issue either: depending on the topology
and message latencies, it is not difficult to produce inescapable contention
(i.e. guaranteed timeouts) that might persist as long as clients continue to
retry. It could be quite a serious degradation of service to impose on our
users.

I don't pretend to know the correct way to make a decision balancing these
considerations, but I am perhaps more concerned about imposing service outages
than I am about temporarily maintaining semantics our users have apparently
accepted for years, though I absolutely share your embarrassment there.




