Re: [DISCUSS] Considering when to push tickets out of 4.0

Benjamin Lerer Wed, 17 Jun 2020 01:58:19 -0700

Just to clarify the status of CASSANDRA-14825:
The latest version of the patch has been reviewed by Dinesh and me. I am
fixing the last details (mainly the documentation), so I expect the patch
to be ready to commit today or tomorrow.


On Wed, Jun 17, 2020 at 10:36 AM Benedict Elliott Smith <[email protected]>
wrote:

> If these tickets are the only blockers I agree with Scott's assessment.
> We could even disable the v5 protocol if we're keen to get it out of the
> door today, and only enable it once 15299 lands.  I don't personally think
> the other two tickets would be impossible to land during a beta either,
> even if they are API affecting - they should be backwards compatible after
> all.
>
> > [Josh] however historically on the project we've had a large number of
> defects surfaced by a diverse collection of users
> > [Scott] this was in part a case of a pressing need to investigate a
> potential 3.0 data resurrection issue drawing attention from 4.0
>
> This is a really common theme with 4.0, whose timeline has been hit
> primarily because of issues still circulating with the 3.0 line that were
> never discovered by testing or user reports during beta, RC, or four years
> of releases.  My personal view, informed by this, is that we _didn't find_
> the most serious bugs historically, even with user reports, and we need to
> be honest with ourselves about this in order to plot a route forwards to
> high quality releases.  We cannot _depend_ on community feedback for
> determining release quality; we need a plan to consciously deliver it
> ourselves.
>
>
> On 17/06/2020, 05:12, "Scott Andreas" <[email protected]> wrote:
>
>     I'll take attribution for the delay in comment on 15299; this was in
> part a case of a pressing need to investigate a potential 3.0 data
> resurrection issue drawing attention from 4.0.
>
>     I agree with the statement that we shouldn't consider protocol V5
> ready for finalization in its current form. If we feel that this ticket
> alone is what delays release of beta and are comfortable with a release
> note caveating that one V5 ticket remains before the new protocol is
> finalized, that could be a reasonable compromise.
>
>     I don't have especially strong feelings re: 15146 and 14825 and think
> these are reasonable candidates for deferral.
>
>     ________________________________________
>     From: Joshua McKenzie <[email protected]>
>     Sent: Tuesday, June 16, 2020 4:08 PM
>     To: [email protected]
>     Subject: Re: [DISCUSS] Considering when to push tickets out of 4.0
>
>     I completely respect and agree with the need for a drumbeat to change
> our
>     culture around testing and quality; I also agree we haven't done much
> to
>     materially change that uniquely to 4.0. The 40_quality_testing epic is
> our
>     first step in that direction though I have some personal concerns about
>     leaning on bespoke manual testing for quality since we humans are
>     infinitely fallible. :)
>
>     What elicited that response from me is the claim that we haven't yet
> tested
>     the software, implicitly invalidating the time and energy the
> community has
>     put into that thus far. I wouldn't argue that we've adequately tested
> for a
>     GA release, certainly, but we're discussing beta in this thread. As a
>     project, the advice we have about the testing and usage of the beta is
>     something along the lines of "use this in test/QA and only in cases
> where
>     minutes of downtime is acceptable." Perhaps we should consider
> revising the
>     release lifecycle on the wiki if this is something we're not aligned
> on?
>
>     To your point above, the problems found to date were largely with 3.0
> and
>     found by user report and not by project developer testing. The sooner
> we
>     can get the 4.0 beta into the hands of the community, the sooner we
> can get
>     more of those reports while we also work to broaden and deepen our
>     programmatic testing frameworks and platforms. (To acknowledge: I
> presume
>     that a majority of the user testing that surfaced defects in 3.0 came
> from
>     one large user's investment of time and resources, however
> historically on
>     the project we've had a large number of defects surfaced by a diverse
>     collection of users and I'd like to see us move in that direction
> again for
>     the long-term health of the project. Hence my attempts to move us
> towards
>     beta and take on an awareness campaign and call to action for the
> community
>     to engage in testing.)
>
>
>     On Tue, Jun 16, 2020 at 6:37 PM Benedict Elliott Smith <
> [email protected]>
>     wrote:
>
>     > > Further, we have thousands of tests across all our suites
>     >
>     > I think most here would agree that our testing remains inadequate,
> and
>     > that this (modest, even in pure numerical terms for such a large
> project)
>     > number of often poorly-written unit tests does not really change
> that fact.
>     >
>     > Most of the problems found to date have been found with 3.0, not
> with 4.0,
>     > and found by user report.  We agreed a long time ago that we would
> aim for
>     > 4.0 to be a more stable release than any prior.  Today I think the
> only
>     > reason that might be true is the amount of work invested in fixing
> problems
>     > found in _earlier releases_, not due to verification of 4.0.
>     >
>     > I say this not to influence the decision about when and what lands in
>     > beta, only to ensure we stay honest with ourselves about our
> progress on
>     > quality.  I hope the software itself is higher quality today, but I
> do not
>     > believe it is honest to (yet) claim that our testing is significantly
>     > higher quality than those releases we all agree were inadequate.
> There
>     > exists some wider external use case testing, but being mostly
> invisible to
>     > the community it is unclear how much broader our coverage is with
> these
>     > included.
>     >
>     > On 16/06/2020, 23:08, "David Capwell" <[email protected]>
> wrote:
>     >
>     >     Inline
>     >
>     >     > On Jun 16, 2020, at 2:17 PM, Joshua McKenzie <
> [email protected]>
>     > wrote:
>     >     >
>     >     >>
>     >     >> we still produce incorrect results as shown by
> CASSANDRA-15313;
>     > this is a
>     >     >> correctness issue, so must be a blocker for v5 protocol.
>     >     >
>     >     > That makes complete sense; I'd somehow missed the incorrect
> results
>     > aspect
>     >     > in trying to get context on the work. I'd be eager to hear
> about
>     > progress
>     >     > on it as well.
>     >     >
>     >     > Regarding the question of "why would users test if we haven't
> tested
>     > yet",
>     >     > I respectfully disagree both on the assertion we haven't
> tested yet
>     > as well
>     >     > as on the distinction between an "us vs. them" in the
> community.
>     > We're all
>     >     > users and participants in the Cassandra community and
> ecosystem so
>     > anyone
>     >     > downloading the DB to test it out is just as vital as one of
> us from
>     > the
>     >     > dev list, committer list, or pmc list testing out the DB.
>     >
>     >     I apologize if I came off as discriminatory; I will try to absorb
> your
>     > words carefully; thank you for correcting my behavior.
>     >
>     >     > While we can
>     >     > reasonably expect a dev paid full time working on the project
> with a
>     > large
>     >     > amount of infrastructure doing testing to be crucial to
> getting a
>     > release
>     >     > out and doing certain kinds of testing, there are literally
>     > thousands of
>     >     > different companies out in the world basing their critical
>     > infrastructure
>     >     > on this project and them testing out their use-cases and
> migration
>     > is just
>     >     > as critical to this release being ready. It takes a village.
>     >
>     >     I do agree that user validation is important for the release, I
> was
>     > mostly trying to question why start here before the testing work in
> JIRA is
>     > complete.  Maybe I am in the wrong, I have been heads down working
> on data
>     > corruption issues in 3.x; I have become more risk averse.
>     >
>     >     >
>     >     > Further, we have thousands of tests across all our suites,
> hundreds
>     > of new
>     >     > use-case testing that has been done against 4.0 at this point,
> and
>     > 30+%
>     >     > more bugs fixed in this release than 3.0; the blanket
> assertion that
>     > we
>     >     > haven't tested 4.0 yet doesn't resonate with me. While we
> haven't
>     > done the
>     >     > entirety of our final 40 beta phase testing yet, testing is
>     > constantly
>     >     > going on against this codebase by both people on the ML and
> off.
>     >     >
>     >     > Now, if there are major known glaring issues where we have
> problems
>     > that
>     >     > would prevent users from actually testing out the beta and
> kicking
>     > the
>     >     > tires, that's a different story entirely and I'd argue those
> tickets
>     > should
>     >     > be reflected in the alpha phase (see: CASSANDRA-15299
> apparently ;) )
>     >     >
>     >     > Does that make sense?
>     >
>     >     I have been meaning to ask this, mostly asking people in Slack
> and
>     > this actually confuses me.
>     >
>     >     I was working off the assumption that the fix version meant it
> was a
>     > blocker for that release, and that Alpha was special-cased and would have
>     > releases even with blocking issues (which is documented in the
> Release
>     > Lifecycle).  When I ask around I hear that this is not correct and
> that
>     > alpha means “blocks beta”, beta means “blocks RC”, etc. (Is any of
> this
>     > documented? I couldn’t find anything last time I was talking to others
> about
>     > this.)
>     >
>     >     Now, let’s say we close alpha and cut a beta release; my
> understanding
>     > is that tickets which block the next beta release are alpha…. So do
> we
>     > still mark them alpha (even though we won’t have an alpha release)?
>     >
>     >     This has been confusing me since beta has a lot of work pending…
> sorry
>     > for not bringing this up in a dedicated dev@ thread
>     >
>     >
>     >     >
>     >     > On Tue, Jun 16, 2020 at 4:58 PM Benedict Elliott Smith <
>     > [email protected]>
>     >     > wrote:
>     >     >
>     >     >> So, if it helps matters: I am explicitly -1 the prior version
> of
>     > this work
>     >     >> due to the technical concerns expressed here and on the
> ticket.  So
>     > we
>     >     >> either need to revert that patch or incorporate 15299.
>     >     >>
>     >     >> On 16/06/2020, 21:48, "Mick Semb Wever" <[email protected]>
> wrote:
>     >     >>
>     >     >>>
>     >     >>> 2) Alternatively, it's been 3 years, 4 months, 13 days since
> the
>     >     >> release of
>     >     >>> 3.10.0 (the last time we added new features to the DB)
>     >     >>>
>     >     >>
>     >     >>
>     >     >>    We did tick-tock, pushing feature releases too quickly, and
>     > without
>     >     >>    supporting them for long enough to get stable. And then
> we've
>     > done "a
>     >     >> la no
>     >     >>    feature releases" for over 3 years. It feels like the bar
> went
>     > from
>     >     >> too low
>     >     >>    to too high.
>     >     >>
>     >     >>    I understand the importance of CASSANDRA-15299. But it
> hasn't
>     > had any
>     >     >>    comments in twelve days, and in this stage of the
> feature
>     > freeze,
>     >     >> with
>     >     >>    so few alpha bugs remaining, that's a long time. Sam, can
> you
>     > speak to
>     >     >> its
>     >     >>    eta?
>     >     >>
>     >     >>
>     >     >>
>     >     >>> 4) If we plan on releasing 4.1 six months after the release
> of 4.0
>     >     >> (i.e.
>     >     >>> calendar scope vs. feature scope - not yet agreed upon but an
>     >     >> option),
>     >     >>
>     >     >>
>     >     >>
>     >     >>    I like this. I think it's worth appreciating the different
>     >     >> perspectives of
>     >     >>    this community: those involved with private clusters that
> don't
>     > rely on
>     >     >>    official releases, versus those involved with the public
> and
>     > other
>     >     >> people's
>     >     >>    clusters. The latter group needs those official releases
> much
>     > more, but
>     >     >>    this also ties into putting those users more in focus and
>     > figuring out
>     >     >>    where the bar best sits. This isn't meant to divide, we
> all care
>     > and
>     >     >> voice
>     >     >>    for the user, but just to utilise the different strengths
>     > brought to
>     >     >> the
>     >     >>    table.
>     >     >>
>     >     >>
>     >     >>> If we want 4.0.0 out faster, the biggest gains would be to
> get the
>     >     >> test
>     >     >>    plans written up and get more people working on automated
>     > testing.
>     >     >>
>     >     >>
>     >     >>    Yes, 110%.  Though, as long as this continues to improve,
> as it
>     > has,
>     >     >> does
>     >     >>    it need to be a blocker on 4.0?
>     >     >>
>     >     >>
>     >     >>
>     >     >>
>     > ---------------------------------------------------------------------
>     >     >> To unsubscribe, e-mail: [email protected]
>     >     >> For additional commands, e-mail:
> [email protected]
>     >     >>
>     >     >>
