Re: Cassandra Contributor Meeting to focus on outstanding 4.0 issues

Joshua McKenzie Tue, 29 Sep 2020 07:38:23 -0700

> I personally prefer to track fail/flaky tests as sub-issue of the 4.0 epic
> (CASSANDRA-15536) so we can track 4.0 completion status in a single place.

Strongly recommend against this approach. If we have hundreds of failing
upgrade tests (or even dozens) then we end up with a wild mix of scope in one
epic: some things are 1-day tasks (fix a test), others are multi-week or
multi-month efforts (scope and build tests for area X).

I thought you were suggesting we track test failures as sub-tasks on a single
ticket in 15536, which could work. But having them as direct children of 15536
is just going to make that noisy enough as to be not useful.

I'd recommend we rely on JQL for the release scope and kanban board it for 
visibility based on fixversion to have our "single pane of glass" for 4.0 
progress.

--
Joshua McKenzie

On Tue, Sep 29, 2020 at 10:07 AM, Paulo Motta < pauloricard...@gmail.com > 
wrote:

> I personally prefer to track fail/flaky tests as sub-issue of the 4.0 epic
> (CASSANDRA-15536) so we can track 4.0 completion status in a single place.
>
> The way I see it is:
> * CASSANDRA-15536 epic: track everything that needs to be done to wrap-up
> 4.0 per macro component.
> * Kanban board: a different view of CASSANDRA-15536, but all issues in the
> Kanban should be ultimately tied to a sub-issue on CASSANDRA-15536.
> * Component sub-issues: both "new ways to test" + "bugs related to the
> component"
> * Test failure sub-issue: group test failures/flakies from any components.
>
> What do you think?
>
> On Tue, 29 Sep 2020 at 10:47, Josh McKenzie < jmckenzie@ apache. org
> ( jmcken...@apache.org ) > wrote:
> 
> 
>> 
>> 
>> Not that I know of. Perhaps we should add a new ticket to the quality epic
>> to track flakey and failing tests? (@cc Josh/Jordan)
>> 
>> 
>> 
>> Either a separate epic or a ticket w/sub-tasks would work well in terms
>> of organization. There's value in having one place to go to cleanly pull
>> that kind of work so I have a slight bias towards an independent epic for
>> 4.0 test *fixing* instead of mixing the "new ways to test" with "cleaning
>> up the testing ways we know".
>> 
>> 
>> 
>> ---
>> Josh McKenzie
>> 
>> 
>> 
>> On Tue, Sep 29, 2020 at 8:34 AM, Paulo Motta < pauloricardomg@ gmail. com (
>> pauloricard...@gmail.com ) > wrote:
>> 
>> 
>>> 
>>> 
>>> Thanks for bringing up these valuable points, Mick! In fact we have focused
>>> on the quality epic so far but there is a lot more stuff unaddressed. I
>>> commented on some of the points you brought up below:
>>> 
>>> 
>>> 
>>> How will we ensure this QA persists, so it's not a manual checklist every
>>> release?
>>> 
>>> 
>>> 
>>> This is a great question but I believe it warrants a separate discussion
>>> as part of a larger discussion on improving our development/quality
>>> process post-4.0.
>>> 
>>> 
>>> 
>>> *** CASSANDRA-15234 – Standardise config and JVM parameters - It looks
>>> like we have dropped the ball on this.
>>> 
>>> 
>>> 
>>> It's sad that we dropped the ball on this important change, but now I
>>> think it's too late to make these changes as they would add entropy while
>>> we are stabilizing 4.0. In that sense I think we should postpone this to
>>> the next major and prioritize it earlier in the next cycle.
>>> 
>>> 
>>> 
>>> Do all remaining flakey and failing units and dtests have jira tickets
>>> entered for 4.0-beta? Has the same been done, at least with rough
>>> grouping, for the upgrade tests? Are these tied to the testing epics in
>>> any way?
>>> 
>>> 
>>> 
>>> Not that I know of. Perhaps we should add a new ticket to the quality
>>> epic to track flakey and failing tests? (@cc Josh/Jordan)
>>> 
>>> 
>>> 
>>> Have any triage efforts happened here?
>>> 
>>> 
>>> 
>>> Not that I know of but maybe Josh/Jordan/Jon (J^3) are planning on
>>> looking at it. I can take a stab at triaging some of these tickets.
>>> 
>>> 
>>> 
>>> Do triaged bugs in this list get moved to fix version "4.x" ?
>>> 
>>> 
>>> 
>>> I think in the spirit of expediting the 4.0 RC release we should move bugs
>>> with low severity (i.e. those with a simple workaround) to 4.0.1. Any bug
>>> with medium-high severity should be marked as 4.0-rc to favor stability.
>>> 
>>> 
>>> 
>>> Are we duplicating efforts in the testing epics when others have already
>>> identified and reported the bugs but we just haven't triaged them?
>>> 
>>> 
>>> 
>>> That's a good point. I think as part of the triaging effort we should
>>> link the bugs to existing quality epics so we can keep track of them.
>>> 
>>> 
>>> 
>>> On Tue, 29 Sep 2020 at 06:11, Sam Tunnicliffe < sam@ beobal. com (
>>> s...@beobal.com ) > wrote:
>>> 
>>> 
>>> 
>>> On 29 Sep 2020, at 09:50, Mick Semb Wever < mck@ apache. org (
>>> m...@apache.org ) > wrote:
>>> 
>>> 
>>> 
>>> Regarding the proposed agenda of going through the unassigned issues to
>>> improve visibility on what needs to be done to ship 4.0 GA I think this
>>> is a great start but only covers part of the problem.
>>> 
>>> 
>>> 
>>> I think we have 3 outstanding issues that are hampering visibility of 4.0
>>> progress:
>>> a) Quality testing issues with no shepherd;
>>> b) Quality testing issues with shepherd, but no recent activity (~2 months
>>> or less);
>>> c) Quality testing issues with no objective acceptance criteria/Definition
>>> of Done;
>>> 
>>> 
>>> 
>>> These Quality testing epics are a great focal point. How will we ensure
>>> this QA persists, so it's not a manual checklist every release?
>>> 
>>> 
>>> 
>>> The following is what I can see outstanding for the 4.0 release, that is
>>> not afaik attached to these epic tickets…
>>> 
>>> 
>>> 
>>> ** Those issues that slipped alpha…
>>> *** CASSANDRA-15299 – CASSANDRA-13304 follow-up: improve checksumming and
>>> compression in protocol v5-beta
>>> *** CASSANDRA-15234 – Standardise config and JVM parameters
>>> *** CASSANDRA-13701 – Lower default num_tokens (blocked by
>>> 'CASSANDRA-16079 Improve dtest runtime' )
>>> ** 95 jira tickets in 4.0-beta and 4.0-rc
>>> ** 631 jira bug tickets with no assigned "fix version"
>>> ** Remaining flakey unit and dtests
>>> ** Hundreds of failing and flakey upgrade dtests
>>> ** Reports from driver tests, and other external test systems
>>> ** Reports and/or integration with Fallout and Harry
>>> 
>>> 
>>> 
>>> In a bit more detail…
>>> 
>>> 
>>> 
>>> *** CASSANDRA-15299 – CASSANDRA-13304 follow-up: improve checksumming and
>>> compression in protocol v5-beta
>>> 
>>> 
>>> 
>>> This looks like it is in its final patch and review. Is that correct Sam?
>>> 
>>> 
>>> 
>>> Yes it is. I hope to get review finished and post some further perf
>>> numbers this week.
>>> 
>>> 
>>> 
>>> *** CASSANDRA-15234 – Standardise config and JVM parameters
>>> 
>>> 
>>> 
>>> It looks like we have dropped the ball on this.
>>> 
>>> 
>>> 
>>> *** CASSANDRA-13701 – Lower default num_tokens, and CASSANDRA-16079
>>> 
>>> 
>>> 
>>> Some effort is underway from Ekaterina, David, and myself. I've put
>>> together a prototype for caching bootstrapped ccm clusters, but I'm not
>>> yet sure I can get much savings over the current tests and only a minimal
>>> saving off the 13701 patch. Berenguer brought up that 40% of the dtests
>>> are single-node, their performance is not changed by 13701, and they are
>>> probably better off rewritten as in-jvm tests.
>>> 
>>> 
>>> 
>>> ** 95 jira tickets in 4.0-beta and 4.0-rc
>>> ** Remaining flakey unit and dtests
>>> ** Hundreds of failing and flakey upgrade dtests
>>> 
>>> 
>>> 
>>> Do all remaining flakey and failing units and dtests have jira tickets
>>> entered for 4.0-beta?
>>> Has the same been done, at least with rough grouping, for the upgrade
>>> tests?
>>> Are these tied to the testing epics in any way?
>>> 
>>> 
>>> 
>>> ** 631 jira bug tickets with no assigned "fix version" (who knows how many
>>> of these are applicable to 4.0?)
>>> 
>>> 
>>> 
>>> Have any triage efforts happened here?
>>> Do triaged bugs in this list get moved to fix version "4.x"? Are we
>>> duplicating efforts in the testing epics when others have already
>>> identified and reported the bugs but we just haven't triaged them?
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
>
