Re: Cassandra Contributor Meeting to focus on outstanding 4.0 issues

Paulo Motta Tue, 29 Sep 2020 07:11:16 -0700

Addendum: I understand that CASSANDRA-15536 was not originally proposed the
way I'm describing, but it could easily be adapted so that we have a single
place to track all tasks related to 4.0 quality and address some of the
visibility points raised by Mick.


On Tue, 29 Sep 2020 at 11:07, Paulo Motta <[email protected]> wrote:

> I personally prefer to track failing/flaky tests as a sub-issue of the 4.0
> epic (CASSANDRA-15536) so we can track 4.0 completion status in a single
> place.
>
> The way I see it is:
> * CASSANDRA-15536 epic: track everything that needs to be done to wrap up
> 4.0 per macro component.
> * Kanban board: a different view of CASSANDRA-15536, but all issues in the
> Kanban should ultimately be tied to a sub-issue on CASSANDRA-15536.
> * Component sub-issues: both "new ways to test" + "bugs related to the
> component"
> * Test failure sub-issue: group test failures/flakies from any component.
>
> What do you think?
>
> On Tue, 29 Sep 2020 at 10:47, Josh McKenzie <[email protected]> wrote:
>
>> Not that I know of. Perhaps we should add a new ticket to the quality epic
>> to track flakey and failing tests? (@cc Josh/Jordan)
>>
>> Either a separate epic or a ticket w/sub-tasks works well in terms of
>> organization. There's value in having one place to go to cleanly pull that
>> kind of work, so I have a slight bias towards an independent epic for 4.0
>> test *fixing* instead of mixing the "new ways to test" with "cleaning up
>> the testing ways we know".
>>
>> ---
>> Josh McKenzie
>>
>>
>>
>> On Tue, Sep 29, 2020 at 8:34 AM, Paulo Motta <[email protected]>
>> wrote:
>>
>> > Thanks for bringing up these valuable points, Mick! In fact, we have
>> > focused on the quality epic so far, but there is a lot more stuff
>> > unaddressed. I commented on some of the points you brought up below:
>> >
>> > How will we ensure this QA persists, so it's not a manual checklist
>> > every release?
>> >
>> > This is a great question, but I believe it warrants a separate
>> > discussion as part of a larger discussion on improving our
>> > development/quality process post-4.0.
>> >
>> > *** CASSANDRA-15234 – Standardise config and JVM parameters - It looks
>> > like we have dropped the ball on this.
>> >
>> > It's sad that we dropped the ball on this important change, but I think
>> > it's now too late to make these changes, as they would add entropy while
>> > we are stabilizing 4.0. In that sense, I think we should postpone this
>> > to the next major and prioritize it earlier in the next cycle.
>> >
>> > Do all remaining flakey and failing units and dtests have jira tickets
>> > entered for 4.0-beta? Has the same been done, at least with rough
>> > grouping, for the upgrade tests? Are these tied to the testing epics in
>> > any way?
>> >
>> > Not that I know of. Perhaps we should add a new ticket to the quality
>> > epic to track flakey and failing tests? (@cc Josh/Jordan)
>> >
>> > Have any triage efforts happened here?
>> >
>> > Not that I know of, but maybe Josh/Jordan/Jon (J^3) are planning on
>> > looking at it. I can take a stab at triaging some of these tickets.
>> >
>> > Do triaged bugs in this list get moved to fix version "4.x"?
>> >
>> > I think that, in the spirit of expediting the 4.0 RC release, we should
>> > assign bugs with low severity (i.e. those with a simple workaround) to
>> > 4.0.1. Any bug with medium-high severity should be marked as 4.0-rc to
>> > favor stability.
>> >
>> > Are we duplicating efforts in the testing epics when others have already
>> > identified and reported the bugs but we just haven't triaged them?
>> >
>> > That's a good point. I think as part of the triaging effort we should
>> > link the bugs to existing quality epics so we can keep track of them.
>> >
>> > On Tue, 29 Sep 2020 at 06:11, Sam Tunnicliffe <[email protected]> wrote:
>> >
>> > On 29 Sep 2020, at 09:50, Mick Semb Wever <[email protected]> wrote:
>> >
>> > Regarding the proposed agenda of going through the unassigned issues to
>> > improve visibility on what needs to be done to ship 4.0 GA I think this
>> > is a great start but only covers part of the problem.
>> >
>> > I think we have 3 outstanding issues that are hampering visibility of
>> > 4.0 progress:
>> > a) Quality testing issues with no shepherd;
>> > b) Quality testing issues with shepherd, but no recent activity (~2
>> > months or less);
>> > c) Quality testing issues with no objective acceptance
>> > criteria/Definition of Done;
>> >
>> > These Quality testing epics are a great focal point. How will we ensure
>> > this QA persists, so it's not a manual checklist every release?
>> >
>> > The following is what I can see outstanding for the 4.0 release, that is
>> > not afaik attached to these epic tickets…
>> >
>> > ** Those issues that slipped alpha…
>> > *** CASSANDRA-15299 – CASSANDRA-13304 follow-up: improve checksumming
>> > and compression in protocol v5-beta
>> > *** CASSANDRA-15234 – Standardise config and JVM parameters
>> > *** CASSANDRA-13701 – Lower default num_tokens (blocked by
>> > 'CASSANDRA-16079 Improve dtest runtime' )
>> > ** 95 jira tickets in 4.0-beta and 4.0-rc
>> > ** 631 jira bug tickets with no assigned "fix version"
>> > ** Remaining flakey unit and dtests
>> > ** Hundreds of failing and flakey upgrade dtests
>> > ** Reports from driver tests, and other external test systems
>> > ** Reports and/or integration with Fallout and Harry
>> >
>> > In a bit more detail…
>> >
>> > *** CASSANDRA-15299 – CASSANDRA-13304 follow-up: improve checksumming
>> > and compression in protocol v5-beta
>> >
>> > This looks like it is in its final patch and review. Is that correct,
>> > Sam?
>> >
>> > Yes it is. I hope to get review finished and post some further perf
>> > numbers this week.
>> >
>> > *** CASSANDRA-15234 – Standardise config and JVM parameters
>> >
>> > It looks like we have dropped the ball on this.
>> >
>> > *** CASSANDRA-13701 – Lower default num_tokens, and CASSANDRA-16079
>> >
>> > Some effort is underway from Ekaterina, David, and myself. I've put
>> > together a prototype for caching bootstrapped ccm clusters, but I'm not
>> > yet sure I can get much savings over the current tests, and only a
>> > minimal saving off the 13701 patch. Berenguer brought up that 40% of the
>> > dtests are single-node, their performance is not changed by 13701, and
>> > they are probably better off rewritten as in-jvm tests.
>> >
>> > ** 95 jira tickets in 4.0-beta and 4.0-rc
>> > ** Remaining flakey unit and dtests
>> > ** Hundreds of failing and flakey upgrade dtests
>> >
>> > Do all remaining flakey and failing units and dtests have jira tickets
>> > entered for 4.0-beta?
>> > Has the same been done, at least with rough grouping, for the upgrade
>> > tests?
>> >
>> > Are these tied to the testing epics in any way?
>> >
>> > ** 631 jira bug tickets with no assigned "fix version" (who knows how
>> > many of these are applicable to 4.0?)
>> >
>> > Have any triage efforts happened here?
>> > Do triaged bugs in this list get moved to fix version "4.x"? Are we
>> > duplicating efforts in the testing epics when others have already
>> > identified and reported the bugs but we just haven't triaged them?
>> >
>> >
>> >
>>
>
