Thank you Sylvain and Benedict for the patch, and thank you to everybody who took the time to contribute to this discussion :-)

On Fri, Nov 27, 2020 at 5:15 PM Sylvain Lebresne <lebre...@gmail.com> wrote:

> I hope I haven't misread this, but it appears we've reached a kind of
> consensus for committing the fix, so I went ahead and did it.
> I added a NEWS entry that I hope is clear (and points to the flag that
> disables the fix if someone wants to go that route), but any committers can
> feel free to ninja-nitpick that NEWS entry if they so wish.
>
> Many thanks to Benjamin for driving the discussion here.
> --
> Sylvain
>
>
> On Tue, Nov 24, 2020 at 3:43 PM Ekaterina Dimitrova <e.dimitr...@gmail.com>
> wrote:
>
> > I am +1 on Benjamin’s proposal
> > and less interruptions during upgrades. For more visibility maybe we can
> > also write a short article about the options and the tradeoffs, further to
> > NEWS.txt (that’s not something to decide now, of course :-) )
> >
> >
> > On Tue, 24 Nov 2020 at 9:13, Benjamin Lerer <benjamin.le...@datastax.com>
> > wrote:
> >
> > > Paulo, what you propose with the yaml seems different from default to
> > > *correctness*. It means to me that we are forcing the user to choose
> > > between *correctness* and *performance*. Most of us have a good
> > > understanding of the problem and it is a hard choice for us. I imagine that
> > > most of the users do not fully understand LWTs and will not know what to
> > > choose. Some might not even use LWTs and will suddenly be forced to make a
> > > choice that they do not understand. It does not feel right to me to push
> > > them to make that choice.
> > >
> > > I also agree with Benedict and Mick that it is a risky thing to do.
> > >
> > > something that can bring a cluster down upon an unprepared user.
> > >
> > >
> > > I do not think that it will be the case (feel free to correct me Benedict).
> > > The impact will probably be an increase in the number of write/read
> > > timeouts for the LWTs read/writes. For a heavy load that would cause the
> > > services depending on those queries to become unreliable. On the other hand
> > > the impact of the current problem is that we can hit some correctness issue
> > > without even knowing it.
> > >
> > > We need to choose between two imperfect solutions and we have some
> > > difficulties to agree on which one to choose.
> > >
> > > Benedict suggested that Sylvain and I made the choice. Sylvain did not want
> > > to make the final call.
> > > I chose correctness. If it is a problem and people prefer to vote. It is
> > > perfectly fine for me too :-)
> > >
> > > I just want us to move forward.
> > >
> > >
> > >
> > > On Tue, Nov 24, 2020 at 12:52 PM Mick Semb Wever <m...@apache.org> wrote:
> > >
> > > > > I think the keyword there is "normally" - if we can't say _certainly_,
> > > > > then this is probably an unsafe change to make.
> > > > >
> > > > > I can imagine any number of hacky upgrade processes that would be
> > > > > dangerous with this change.
> > > > >
> > > >
> > > > I agree. We just don't know what users are doing, this is risky.
> > > >
> > > > IMO the same applies to a performance degradation, i.e. something that can
> > > > bring a cluster down upon an unprepared user. Despite our best efforts with
> > > > NEWS.txt we should still look after such users. IMHO the imperfection of
> > > > LWTs on past branches we have to carry. I'm well aware this is easier said
> > > > than done, even for far simpler changes. Having the flag there to switch to
> > > > "correct LWT" is still a huge win for users.