We were able to narrow it down, and it seems that both issues were introduced by [1] (both tests pass without this commit). There is a preliminary fix, and we're working on a minimal repro. Please track [2] for more information and the latest updates.

[1] https://github.com/apache/cassandra/commit/b7e1e44a90
[2] https://issues.apache.org/jira/browse/CASSANDRA-18932

On Sat, Nov 4, 2023, at 7:52 PM, Mick Semb Wever wrote:
>
> Please mark such bugs with fixVersion 5.0-beta
>
> If there are no more tickets that need API changes (i.e. those that should be
> marked fixVersion 5.0-alpha) this then indicates we do not need a 5.0-alpha3
> release and can focus towards 5.0-beta1 (regardless of having blockers open
> to it).
>
> Appreciate the attention 18993 is getting – we do have a shortlist of beta
> blockers that we gotta prioritise !
>
>
> On Sat, 4 Nov 2023 at 18:33, Benedict <bened...@apache.org> wrote:
>>
>> Yep, data loss bugs are not any old bug. I’m concretely -1 (binding)
>> releasing a beta with one that’s either under investigation or confirmed.
>>
>> As Scott says, hopefully it won’t come to that - the joy of deterministic
>> testing is this should be straightforward to triage.
>>
>>
>>> On 4 Nov 2023, at 17:30, C. Scott Andreas <sc...@paradoxica.net> wrote:
>>> I’d happily be the first to vote -1(nb) on a release containing a known
>>> and reproducible bug that can result in data loss or an incorrect response
>>> to a query. And I certainly wouldn’t run it.
>>>
>>> Since we have a programmatic repro within just a few seconds, this should
>>> not take long to root-cause.
>>>
>>> On Friday, Alex worked to get this reproducing on a Cassandra branch rather
>>> than via unstaged changes. We should have a published / shareable example
>>> with details near the beginning of the week.
>>>
>>> – Scott
>>>
>>>> On Nov 4, 2023, at 10:17 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>
>>>>> I think before we cut a beta we need to have diagnosed and fixed 18993
>>>>> (assuming it is a bug).
>>>> Before a beta? I could see that for rc or GA definitely, but having a
>>>> known (especially non-regressive) data loss bug in a beta seems like it's
>>>> compatible with the guarantees we're providing for it:
>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/Release+Lifecycle
>>>>
>>>>> This release is recommended for test/QA clusters where short(order of
>>>>> minutes) downtime during upgrades is not an issue
>>>>
>>>>
>>>> On Sat, Nov 4, 2023, at 12:56 PM, Ekaterina Dimitrova wrote:
>>>>> Totally agree with the others. Such an issue on its own should be a
>>>>> priority in any release. Looking forward to the reproduction test
>>>>> mentioned on the ticket.
>>>>>
>>>>> Thanks to Alex for his work on harry!
>>>>>
>>>>> On Sat, 4 Nov 2023 at 12:47, Benedict <bened...@apache.org> wrote:
>>>>>> Alex can confirm but I think it actually turns out to be a new bug in
>>>>>> 5.0, but either way we should not cut a release with such a serious
>>>>>> potential known issue.
>>>>>>
>>>>>> > On 4 Nov 2023, at 16:18, J. D. Jordan <jeremiah.jor...@gmail.com> wrote:
>>>>>> >
>>>>>> > Sounds like 18993 is not a regression in 5.0? But present in 4.1 as
>>>>>> > well? So I would say we should fix it with the highest priority and
>>>>>> > get a new 4.1.x released. Blocking 5.0 beta voting is a secondary
>>>>>> > issue to me if we have a “data not being returned” issue in an
>>>>>> > existing release?
>>>>>> >
>>>>>> >> On Nov 4, 2023, at 11:09 AM, Benedict <bened...@apache.org> wrote:
>>>>>> >>
>>>>>> >> I think before we cut a beta we need to have diagnosed and fixed
>>>>>> >> 18993 (assuming it is a bug).
>>>>>> >>
>>>>>> >>>> On 4 Nov 2023, at 16:04, Mick Semb Wever <m...@apache.org> wrote:
>>>>>> >>>
>>>>>> >>>
>>>>>> >>>>
>>>>>> >>>> With the publication of this release I would like to switch the
>>>>>> >>>> default 'latest' docs on the website from 4.1 to 5.0. Are there any
>>>>>> >>>> objections to this ?
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> I would also like to propose the next 5.0 release to be 5.0-beta1
>>>>>> >>>
>>>>>> >>> With the aim of reaching GA for the Summit, I would like to suggest
>>>>>> >>> we work towards the best-case scenario of 5.0-beta1 in two weeks and
>>>>>> >>> 5.0-rc1 first week Dec.
>>>>>> >>>
>>>>>> >>> I know this is a huge ask with lots of unknowns we can't actually
>>>>>> >>> commit to. But I believe it is a worthy goal, and possible if
>>>>>> >>> nothing sideswipes us – but we'll need all the help we can get this
>>>>>> >>> month to make it happen.
>>>>>> >>
>>>>