+1. I would add a 'post-commit' step: check the Jenkins CI run for your merge and see if something broke regardless.
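For what it's worth, a quick way to do that post-commit check from a terminal is to hit the Jenkins JSON API for the branch job. A minimal sketch only: ci-cassandra.apache.org is our Jenkins, but the "Cassandra-trunk" job name below is an assumption, so substitute the pipeline for whatever branch you merged to.

    # Result of the last completed run, then the failure counts from its test
    # report. The "Cassandra-trunk" job name is assumed; adjust as needed.
    JOB=https://ci-cassandra.apache.org/job/Cassandra-trunk
    curl -s "$JOB/lastCompletedBuild/api/json?tree=result,number"
    curl -s "$JOB/lastCompletedBuild/testReport/api/json?tree=failCount,skipCount,totalCount"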
On 6/12/21 23:51, Ekaterina Dimitrova wrote:
> Hi Josh,
> All good questions, thank you for raising this topic. To the best of my knowledge we don't have those documented, but I will put down notes on the tribal knowledge I know about and personally follow :-)
>
> Pre-commit test suites:
> * Which JDKs? - Both are officially supported, so both.
>
> * When to include all python tests or do JVM only (if ever)? - JVM only, probably, when I am testing just a test fix.
>
> * When to run upgrade tests? - I haven't heard any definitive guideline. Preferably every time, but if there is a tiny change I guess it can be decided to skip them. I would advocate to do more rather than less.
>
> * What to do if a test is also failing on the reference root (i.e. trunk, cassandra-4.0, etc)? - Check whether a ticket exists already; if not, open one at least, even if I don't plan to work on it, to acknowledge the issue and add any info I know about. If we know who broke it, ping the author to check it.
>
> * What to do if a test fails intermittently? - Open a ticket. During investigation, use the CircleCI jobs for running tests in a loop to find when it fails or to verify the test was fixed. (This is already in my draft CircleCI document, not yet released as it was pending on the documents migration.)
>
> Hope that helps.
>
> ~Ekaterina
>
> On Mon, 6 Dec 2021 at 17:20, Joshua McKenzie <jmcken...@apache.org> wrote:
>
>> As I work through the scripting on this, I don't know if we've documented or clarified the following (I don't see it here: https://cassandra.apache.org/_/development/testing.html):
>>
>> Pre-commit test suites:
>> * Which JDKs?
>> * When to include all python tests or do JVM only (if ever)?
>> * When to run upgrade tests?
>> * What to do if a test is also failing on the reference root (i.e. trunk, cassandra-4.0, etc)?
>> * What to do if a test fails intermittently?
>>
>> I'll also update the above linked documentation once we hammer this out, and try to bake it into the scripting flow as much as possible as well. The goal is to make it easy to do the right thing and hard to do the wrong thing, and to have these things written down rather than have them be tribal knowledge that varies a lot across the project.
>>
>> ~Josh
>>
>> On Sat, Dec 4, 2021 at 9:04 AM Joshua McKenzie <jmcken...@apache.org> wrote:
>>
>>> After some offline collab, here's where this thread has landed: a proposal to change our processes incrementally and hopefully stabilize the state of CI longer term.
>>>
>>> Link: https://docs.google.com/document/d/1tJ-0K7d6PIStSbNFOfynXsD9RRDaMgqCu96U4O-RT84/edit#bookmark=id.16oxqq30bby4
>>> Hopefully the mail server doesn't butcher formatting; if it does, hit up the gdoc and leave comments there, as it should be open to all.
>>>
>>> Phase 1:
>>> Document merge criteria; update circle jobs to have a simple pre-merge job (one for each JDK profile)
>>> * Donate, document, and formalize usage of circleci-enable.py in the ASF repo (need a new commit scripts / dev tooling section?)
>>> * rewrites circle config jobs to a simple, clear flow
>>> * ability to toggle between "run on push" or "click to run"
>>> * variety of other functionality; see below
>>> Document (site, help, README.md) and automate via scripting the relationship / dev / release process around:
>>> * In-jvm dtest
>>> * dtest
>>> * ccm
>>> Integrate and document usage of the script to build CI repeat test runs
>>> * circleci-enable.py --repeat-unit org.apache.cassandra.SomeTest
>>> * Document "Do this if you add or change tests"
>>> Introduce "Build Lead" role
>>> * Weekly rotation; volunteer
>>> * 1: Make sure JIRAs exist for test failures
>>> * 2: Attempt to triage new test failures to root cause and assign out
>>> * 3: Coordinate and drive to a green board on trunk
>>> Change and automate the process for *trunk only* patches:
>>> * Block on green CI (from the merge criteria above; potentially a stricter definition of "clean" for trunk CI)
>>> * Consider using github PRs to merge (TODO: determine how to handle circle + CHANGES; see below)
>>> Automate the process for *multi-branch* merges
>>> * Harden / contribute / document dcapwell's script (he has one which does the following):
>>>     * rebases your branch to the latest (if on 3.0 then rebase against cassandra-3.0)
>>>     * checks it compiles
>>>     * removes all changes to .circleci (can opt out for circleci patches)
>>>     * removes all changes to CHANGES.txt and leverages JIRA for the content
>>>     * checks the code still compiles
>>>     * changes circle config to run CI
>>>     * pushes to a temp branch in git and runs CI (circle + Jenkins)
>>>     * when all branches are clean (waiting step is manual)
>>>         * TODO: Define "clean"
>>>             * No new test failures compared to reference?
>>>             * Or no test failures at all?
>>>     * merges changes into the actual branches
>>>     * merges up changes, rewriting the diff
>>>     * push --atomic
>>>
>>> Transition to phase 2 when:
>>> * All items from phase 1 are complete
>>> * Test boards for supported branches are green
>>>
>>> Phase 2:
>>> * Add Harry to a recurring run against trunk
>>> * Add Harry to the release pipeline
>>> * Suite of perf tests against trunk, recurring
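(Inline note on the multi-branch merge item quoted above: the rough shape of one leg of that flow, as described, is something like the sketch below. This is illustrative only, not dcapwell's actual script; the branch names, the ci/ temp-branch naming, and the "ant jar" compile check are placeholders of mine.)

    # Illustrative sketch of one branch's leg of the flow described above; not
    # the real script. Branch names, the ci/ temp branch and "ant jar" are
    # placeholders.
    git checkout my-fix-3.0
    git rebase origin/cassandra-3.0       # rebase onto the latest of the target branch
    ant jar                               # check it still compiles
    git push origin HEAD:ci/my-fix-3.0    # temp branch; circle + Jenkins run against it
    # ...repeat per release branch; once every branch is clean, merge up and
    # publish all the branches in one shot so nothing lands partially:
    git push --atomic origin cassandra-3.0 cassandra-3.11 cassandra-4.0 trunk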
>>> On Wed, Nov 17, 2021 at 1:42 PM Joshua McKenzie <jmcken...@apache.org> wrote:
>>>
>>>> Sorry for not catching that Benedict, you're absolutely right. So long as we're using merge commits between branches, I don't think auto-merging via train or blocking on green CI are options via the tooling, and multi-branch reverts will be something we should document very clearly should we even choose to go that route (a lot of room to make mistakes there).
>>>>
>>>> It may not be a huge issue, as we can expect the more disruptive (i.e. potentially destabilizing) changes to be happening on trunk only, so perhaps we can get away with slightly different workflows or policies based on whether you're doing a multi-branch bugfix or a feature on trunk. Bears thinking more deeply about.
>>>>
>>>> I'd also be game for revisiting our merge strategy. I don't see much difference in labor between merging between branches vs. preparing separate patches for an individual developer; however, I'm sure there are maintenance and integration implications there I'm not thinking of right now.
>>>>
>>>> On Wed, Nov 17, 2021 at 12:03 PM bened...@apache.org <bened...@apache.org> wrote:
>>>>
>>>>> I raised this before, but to highlight it again: how do these approaches interface with our merge strategy?
>>>>>
>>>>> We might have to rebase several dependent merge commits and want to merge them atomically. So far as I know these tools don't work fantastically in this scenario, but if I'm wrong that's fantastic. If not, given how important these things are, should we consider revisiting our merge strategy?
>>>>>
>>>>> From: Joshua McKenzie <jmcken...@apache.org>
>>>>> Date: Wednesday, 17 November 2021 at 16:39
>>>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>>>> Subject: Re: [DISCUSS] Releasable trunk and quality
>>>>> Thanks for the feedback and insight Henrik; it's valuable to hear how other large, complex infra projects have tackled this problem set.
>>>>>
>>>>> To attempt to summarize what I got from your email:
>>>>> [Phase one]
>>>>> 1) Build Barons: a rotation where there's always someone actively tying failures to changes and adding those failures to our ticketing system
>>>>> 2) A best-effort process of "test breakers" being assigned tickets to fix the things their work broke
>>>>> 3) Moving to a culture where we regularly revert commits that break tests
>>>>> 4) Running tests before we merge changes
>>>>>
>>>>> [Phase two]
>>>>> 1) Suite of performance tests on a regular cadence against trunk (w/ hunter or otherwise)
>>>>> 2) Integration w/ github merge-train pipelines
>>>>>
>>>>> Does that cover the highlights? I agree with these points as useful places for us to invest in as a project, and I'll work on getting this into a gdoc for us to align on and discuss further this week.
>>>>>
>>>>> ~Josh
>>>>>
>>>>> On Wed, Nov 17, 2021 at 10:23 AM Henrik Ingo <henrik.i...@datastax.com> wrote:
>>>>>
>>>>>> There's an old joke: How many people read Slashdot? The answer is 5. The rest of us just write comments without reading... In that spirit, I wanted to share some thoughts in response to your question, even if I know some of it will have been said in this thread already :-)
>>>>>>
>>>>>> Basically, I just want to share what has worked well in my past projects...
>>>>>>
>>>>>> Visualization: Now that we have Butler running, we can already see a decline in failing tests for 4.0 and trunk! This shows that contributors want to do the right thing; we just need the right tools and processes to achieve success.
>>>>>>
>>>>>> Process: I'm confident we will soon be back to seeing 0 failures for 4.0 and trunk. However, keeping that state requires constant vigilance! At MongoDB we had a role called Build Baron (aka Build Cop, etc.). This is a weekly rotating role where the person who is the Build Baron will, at least once per day, go through all of the Butler dashboards to catch new regressions early. We have used the same process at Datastax to guard our downstream fork of Cassandra 4.0. It's the responsibility of the Build Baron to:
>>>>>> - file a jira ticket for new failures
>>>>>> - determine which commit is responsible for introducing the regression. Sometimes this is obvious, sometimes this requires "bisecting" by running more builds, e.g. between two nightly builds.
>>>>>> - assign the jira ticket to the author of the commit that introduced the regression
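(Inline note on the bisecting step above: when the offending commit isn't obvious, git bisect can drive the search automatically. Purely illustrative; the good/bad SHAs and the ant invocation are placeholders for whatever command reproduces the failing test.)

    # Find the commit that broke a test between a known-good and known-bad build.
    # <bad-sha>/<good-sha> and the ant target are placeholders.
    git bisect start <bad-sha> <good-sha>
    git bisect run ant testsome -Dtest.name=org.apache.cassandra.SomeTest
    git bisect reset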
>>>>>> Given that Cassandra is a community that includes part-time and volunteer developers, we may want to try some variation of this, such as pairing 2 build barons each week?
>>>>>>
>>>>>> Reverting: A policy that the commit causing the regression is automatically reverted can be scary. It takes courage to be the junior test engineer who reverts yesterday's commit from the founder and CTO, just to give an example... Yet this is the most efficient way to keep the build green. And it turns out it's not that much additional work for the original author to fix the issue and then re-merge the patch.
>>>>>>
>>>>>> Merge-train: For any project with more than 1 commit per day, it will inevitably happen that you need to rebase a PR before merging, and even if it passed all tests before, after the rebase it won't. In the downstream Cassandra fork previously mentioned, we have tried to enable a github rule which requires a) that all tests passed before merging, b) that the PR is against the head of the branch being merged into, and c) that the tests were run after such a rebase. Unfortunately this leads to infinite loops where a large PR may never be able to commit because it has to be rebased again and again while smaller PRs merge faster. The solution to this problem is to have an automated process for the rebase-test-merge cycle. Gitlab supports such a feature and calls it merge-train: https://docs.gitlab.com/ee/ci/pipelines/merge_trains.html
>>>>>>
>>>>>> The merge-train can be considered an advanced feature and we can return to it later. The other points should be sufficient to keep a reasonably green trunk.
>>>>>>
>>>>>> I guess the major area where we can improve daily test coverage would be performance tests. To that end we recently open sourced a nice tool that can algorithmically detect performance regressions in a timeseries history of benchmark results: https://github.com/datastax-labs/hunter Just like with correctness testing, it's my experience that catching regressions the day they happened is much better than trying to do it at beta or rc time. Piotr also blogged about Hunter when it was released:
>>>>>> https://medium.com/building-the-open-data-stack/detecting-performance-regressions-with-datastax-hunter-c22dc444aea4
>>>>>>
>>>>>> henrik
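(One practical wrinkle with the reverting policy above, given our merge-commit strategy: reverting a merge commit needs the mainline parent named explicitly, which is part of why multi-branch reverts deserve careful documentation. A minimal illustration; the SHAs are placeholders.)

    # Reverting a plain commit vs. a merge commit; -m 1 picks the mainline parent.
    git revert <sha>
    git revert -m 1 <merge-sha>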
>>>>>> On Sat, Oct 30, 2021 at 4:00 PM Joshua McKenzie <jmcken...@apache.org> wrote:
>>>>>>
>>>>>>> We as a project have gone back and forth on the topic of quality and the notion of a releasable trunk for quite a few years. If people are interested, I'd like to rekindle this discussion a bit and see if we're happy with where we are as a project, or if we think there are steps we should take to change the quality bar going forward. The following questions have been rattling around for me for a while:
>>>>>>>
>>>>>>> 1. How do we define what "releasable trunk" means? All reviewed by M committers? Passing N% of tests? Passing all tests plus some other metrics (manual testing, raising the number of reviewers, test coverage, usage in dev or QA environments, etc.)? Something else entirely?
>>>>>>>
>>>>>>> 2. With a definition settled upon in #1, what steps, if any, do we need to take to get from where we are to having *and keeping* that releasable trunk? Anything to codify there?
>>>>>>>
>>>>>>> 3. What are the benefits of having a releasable trunk as defined here? What are the costs? Is it worth pursuing? What are the alternatives (for instance: a freeze before a release plus a stabilization focus by the community, i.e. the 4.0 push or the tock in tick-tock)?
>>>>>>>
>>>>>>> Given the large volume of work coming down the pike with CEPs, this seems like a good time to at least check in on this topic as a community.
>>>>>>>
>>>>>>> Full disclosure: running face-first into 60+ failing tests on trunk when going through the commit process for denylisting this week brought this topic back up for me (reminds me of when I went to merge CDC back in 3.6 and those test failures riled me up... I sense a pattern ;))
>>>>>>>
>>>>>>> Looking forward to hearing what people think.
>>>>>>>
>>>>>>> ~Josh
>>>>>>
>>>>>> --
>>>>>> Henrik Ingo
>>>>>> +358 40 569 7354