Re: Cassandra Contributor Meeting to focus on outstanding 4.0 issues

Paulo Motta Tue, 29 Sep 2020 07:53:14 -0700

> Isn't this hi-jacking the meaning (and value) of the "4.0-beta" and
> "4.0-rc" fixVersion placeholders?


Makes sense, I hadn't thought of this. I retract my suggestion.

> Kinda agree with Josh here on what the epics should focus on. Personally,
> because that better isolates and highlights what's missing from continuous
> and automated QA post-4.0, looping back to my first question and concern.

+1

> I thought you were suggesting we do test failures as sub-tasks on a
> ticket in 15536 which could work. But having them children of 15536 is just
> going to make that noisy enough as to be not useful.

I was actually advocating for the former, but I agree we should restrict the
scope of the single CASSANDRA-15536 epic to new improvements, as you suggested
and Mick concurred.

> I'd recommend we rely on JQL for the release scope and kanban board it
> for visibility based on fixversion to have our "single pane of glass" for
> 4.0 progress.

+1

With that said, I think it could be beneficial for visibility to track
flaky and failing tests blocking 4.0 in a single epic with fixVersion 4.0-rc.

On Tue, Sep 29, 2020 at 11:38, Joshua McKenzie <joshua.mcken...@gmail.com>
wrote:

> > I personally prefer to track fail/flaky tests as sub-issue of the 4.0 epic
> > (CASSANDRA-15536) so we can track 4.0 completion status in a single place.
>
> Strongly recommend against this approach. If we have hundreds of failing
> upgrade tests (or even dozens) then we end up with a wild mix of scope in
> one epic. Some things 1 day tasks (fix a test), other things multi-week or
> month efforts (scope and build tests for area X).
>
> I thought you were suggesting we do test failures as sub-tasks on a ticket
> in 15536 which could work. But having them children of 15536 is just going
> to make that noisy enough as to be not useful.
>
> I'd recommend we rely on JQL for the release scope and kanban board it for
> visibility based on fixversion to have our "single pane of glass" for 4.0
> progress.
>
> --
> Joshua McKenzie
>
> On Tue, Sep 29, 2020 at 10:07 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
>
> >
> >
> >
> > I personally prefer to track fail/flaky tests as sub-issue of the 4.0 epic
> > (CASSANDRA-15536) so we can track 4.0 completion status in a single place.
> >
> >
> >
> >
> > The way I see it is:
> > * CASSANDRA-15536 epic: track everything that needs to be done to wrap-up
> >   4.0 per macro component.
> > * Kanban board: a different view of CASSANDRA-15536, but all issues in the
> >   Kanban should be ultimately tied to a sub-issue on CASSANDRA-15536.
> > * Component sub-issues: both "new ways to test" + "bugs related to the
> >   component"
> > * Test failure sub-issue: group test failures/flakies from any components.
> >
> >
> >
> >
> > What do you think?
> >
> >
> >
> > On Tue, Sep 29, 2020 at 10:47, Josh McKenzie <jmcken...@apache.org> wrote:
> >
> >
> >>
> >>
> >>> Not that I know of. Perhaps we should add a new ticket to the quality epic
> >>> to track flakey and failing tests? (@cc Josh/Jordan)
> >>
> >> Either a separate epic or a ticket w/sub-tasks would work well in terms
> >> of organization. There's value in having one place to go to cleanly pull
> >> that kind of work so I have a slight bias towards an independent epic for
> >> 4.0 test *fixing* instead of mixing the "new ways to test" with "cleaning
> >> up the testing ways we know".
> >>
> >>
> >>
> >> ---
> >> Josh McKenzie
> >>
> >>
> >>
> >> On Tue, Sep 29, 2020 at 8:34 AM, Paulo Motta <pauloricard...@gmail.com> wrote:
> >>
> >>
> >>>
> >>>
> >>> Thanks for bringing up these valuable points, Mick! In fact we focused on
> >>> the quality epic so far but there is a lot more stuff unaddressed. I
> >>> commented on some of the points you brought up below:
> >>>
> >>>
> >>>
> >>>> How will we ensure this QA persists, so it's not a manual checklist
> >>>> every release?
> >>>
> >>> This is a great question but I believe it warrants a separate discussion
> >>> as part of a larger discussion on improving our development/quality
> >>> process post-4.0.
> >>>
> >>>
> >>>
> >>>> *** CASSANDRA-15234 – Standardise config and JVM parameters - It looks
> >>>> like we have dropped the ball on this.
> >>>
> >>> It's sad that we dropped the ball on this important change but now I think
> >>> it's too late to make these changes as it will bring entropy towards
> >>> stabilizing 4.0. In that sense I think we should postpone this to the next
> >>> major and prioritize it earlier in the next cycle.
> >>>
> >>>
> >>>
> >>>> Do all remaining flakey and failing units and dtests have jira tickets
> >>>> entered for 4.0-beta? Has the same been done, at least with rough
> >>>> grouping, for the upgrade tests? Are these tied to the testing epics in any
> >>>> way?
> >>>
> >>> Not that I know of. Perhaps we should add a new ticket to the quality epic
> >>> to track flakey and failing tests? (@cc Josh/Jordan)
> >>>
> >>>
> >>>
> >>>> Have any triage efforts happened here?
> >>>
> >>> Not that I know of but maybe Josh/Jordan/Jon (J^3) are planning on looking
> >>> at it. I can take a stab at triaging some of these tickets.
> >>>
> >>>
> >>>
> >>>> Do triaged bugs in this list get moved to fix version "4.x" ?
> >>>
> >>> I think in the spirit of expediting 4.0RC release we should mark bugs with
> >>> low severity (ie. those with a simple workaround) to 4.0.1. Any bug with
> >>> medium-high severity should be marked as 4.0-rc to favor stability.
> >>>
> >>>
> >>>
> >>>> Are we duplicating efforts in the testing epics when others have already
> >>>> identified and reported the bugs but we just haven't triaged them?
> >>>
> >>> That's a good point. I think as part of the triaging effort we should link
> >>> the bugs to existing quality epics so we can keep track of them.
> >>>
> >>>
> >>>
> >>> On Tue, Sep 29, 2020 at 06:11, Sam Tunnicliffe <s...@beobal.com> wrote:
> >>>
> >>>
> >>>
> >>> On 29 Sep 2020, at 09:50, Mick Semb Wever <m...@apache.org> wrote:
> >>>
> >>>
> >>>
> >>> Regarding the proposed agenda of going through the unassigned issues to
> >>> improve visibility on what needs to be done to ship 4.0 GA I think this is
> >>> a great start but only covers part of the problem.
> >>>
> >>> I think we have 3 outstanding issues that are hampering visibility of 4.0
> >>> progress:
> >>> a) Quality testing issues with no shepherd;
> >>> b) Quality testing issues with shepherd, but no recent activity (~2 months
> >>> or less);
> >>> c) Quality testing issues with no objective acceptance criteria/Definition
> >>> of Done;
> >>>
> >>>
> >>>
> >>> These Quality testing epics are a great focal point. How will we ensure
> >>> this QA persists, so it's not a manual checklist every release?
> >>>
> >>>
> >>>
> >>> The following is what I can see outstanding for the 4.0 release, that is
> >>> not afaik attached to these epic tickets…
> >>>
> >>>
> >>>
> >>> ** Those issues that slipped alpha…
> >>> *** CASSANDRA-15299 – CASSANDRA-13304 follow-up: improve checksumming and
> >>> compression in protocol v5-beta
> >>> *** CASSANDRA-15234 – Standardise config and JVM parameters
> >>> *** CASSANDRA-13701 – Lower default num_tokens (blocked by
> >>> 'CASSANDRA-16079 Improve dtest runtime' )
> >>> ** 95 jira tickets in 4.0-beta and 4.0-rc
> >>> ** 631 jira bug tickets with no assigned "fix version"
> >>> ** Remaining flakey unit and dtests
> >>> ** Hundreds of failing and flakey upgrade dtests
> >>> ** Reports from driver tests, and other external test systems
> >>> ** Reports and/or integration with Fallout and Harry
> >>>
> >>>
> >>>
> >>> In a bit more detail…
> >>>
> >>>
> >>>
> >>> *** CASSANDRA-15299 – CASSANDRA-13304 follow-up: improve checksumming and
> >>> compression in protocol v5-beta
> >>>
> >>>
> >>>
> >>> This looks like it is in its final patch and review. Is that correct, Sam?
> >>>
> >>>
> >>>
> >>> Yes it is. I hope to get review finished and post some further perf
> >>> numbers this week.
> >>>
> >>>
> >>>
> >>> *** CASSANDRA-15234 – Standardise config and JVM parameters
> >>>
> >>>
> >>>
> >>> It looks like we have dropped the ball on this.
> >>>
> >>>
> >>>
> >>> *** CASSANDRA-13701 – Lower default num_tokens, and CASSANDRA-16079
> >>>
> >>>
> >>>
> >>> Some effort is underway from Ekaterina, David, and myself. I've put
> >>> together a prototype for caching bootstrapped ccm clusters, but I'm not yet
> >>> sure I can get much savings over the current tests and only a minimal
> >>> saving off the 13701 patch. Berenguer brought up that 40% of the dtests are
> >>> single-node, their performance not changed by 13701, and probably better
> >>> off rewritten to in-jvm tests.
> >>>
> >>>
> >>>
> >>> ** 95 jira tickets in 4.0-beta and 4.0-rc
> >>> ** Remaining flakey unit and dtests
> >>> ** Hundreds of failing and flakey upgrade dtests
> >>>
> >>>
> >>>
> >>> Do all remaining flakey and failing units and dtests have jira tickets
> >>> entered for 4.0-beta?
> >>> Has the same been done, at least with rough grouping, for the upgrade tests?
> >>> Are these tied to the testing epics in any way?
> >>>
> >>>
> >>>
> >>> ** 631 jira bug tickets with no assigned "fix version" (who knows how many
> >>> of these are applicable to 4.0?)
> >>>
> >>>
> >>>
> >>> Have any triage efforts happened here?
> >>> Do triaged bugs in this list get moved to fix version "4.x" ? Are we
> >>> duplicating efforts in the testing epics when others have already
> >>> identified and reported the bugs but we just haven't triaged them?
> >>>
> >>>
> >>>
