Re: [DISCUSS] Revising our release criteria, commit guidelines, and the role of circleci vs. ASF CI

Josh McKenzie Mon, 03 Oct 2022 07:25:32 -0700

Any further revisions or objections to this or are we good to take it to a vote?


On Wed, Sep 28, 2022, at 10:54 AM, Josh McKenzie wrote:
> So revised proposal:
> 
> On Release Lifecycle cwiki page:
>  - Ensure we have parity on jobs run between circle and asf-ci
>  - Allow usage of circleci as gatekeeper for releases. 1 green run -> beta, 3 
> green runs consecutive -> ga
>  - No new consistent regressions on CI for asf compared to prior branches
>  - Explicitly do not consider ci-cassandra asf flaky tests as release blockers
> 
> Changes to codify into documentation:
>  - On patch before commit, multiplex @500 all new tests, changed tests, or 
> expected to be impacted tests ("expected to be impacted" piece pending 
> multi-class multiplexing support):
>  - Add support for multi-class specification in multiplexer and document
> 
> Add informal project commitment during next major release lifecycle to 
> continue working on bringing asf ci-cassandra up to where it can be formal 
> gatekeeper for release.
> 
> On Wed, Sep 28, 2022, at 10:13 AM, Ekaterina Dimitrova wrote:
>> If we talk blockers nothing more than ensuring we see all tests we want 
>> pre-release, IMHO. 
>> The other points sound to me like future important improvements that will 
>> help us significantly in the flaky test fight.
>> 
>> On Wed, 28 Sep 2022 at 10:08, Josh McKenzie <jmcken...@apache.org> wrote:
>>> __
>>> I'm receptive to that but I wouldn't gate our ability to get 4.1 out the 
>>> door based on circle on that. Honestly probably only need to have the 
>>> parity of coverage be the blocker for its use in retrospect.
>>> 
>>> On Wed, Sep 28, 2022, at 1:32 AM, Berenguer Blasi wrote:
>>>> I would add an option for generate.sh to detect all changed *Test.java 
>>>> files, that would be handy imo.
>>>> 
>>>> On 28/9/22 4:29, Josh McKenzie wrote:
>>>>> So:
>>>>>  1. 500 iterations on multiplexer
>>>>>  2. Augmenting generate.sh to allow providing multiple class names and 
>>>>> generating a single config that'll multiplex all the tests provided
>>>>>  3. Test parity / pre-release config added on circleci (see 
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-17930), specifically 
>>>>> dtest-large, dtest-offheap, test-large-novnode
>>>>> If we get the above 3, are we at a place where we're good to consider 
>>>>> vetting releases on circleci for beta / rc / ga?
>>>>> 
>>>>> On Tue, Sep 27, 2022, at 11:28 AM, Ekaterina Dimitrova wrote:
>>>>>>> “I have plans on modifying the multiplexer to allow specifying a list 
>>>>>>> of classes per test target, so we don't have to needlessly suffer with 
>>>>>>> this”
>>>>>>> 
>>>>>>> 
>>>>>>> That would be great, I was thinking of that the other day too. With 
>>>>>>> that said I’ll be happy to support you in that effort too :-) 
>>>>>> 
>>>>>> On Tue, 27 Sep 2022 at 11:18, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>> 
>>>>>>>> I have plans on modifying the multiplexer to allow specifying a list 
>>>>>>>> of classes per test target, so we don't have to needlessly suffer with 
>>>>>>>> this
>>>>>>> This sounds integral to us multiplexing tests on large diffs whether we 
>>>>>>> go with circle for releases or not and would be a great addition!
>>>>>>> 
>>>>>>> On Tue, Sep 27, 2022, at 6:19 AM, Andrés de la Peña wrote:
>>>>>>>>> 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>> 
>>>>>>>> I agree that 500 iterations would be a reasonable minimum. We have 
>>>>>>>> seen flaky unit tests requiring far more iterations, but that's not 
>>>>>>>> very common. We could use to 500 iterations as default, and 
>>>>>>>> discretionary use a higher limit in tests that are quick and might be 
>>>>>>>> prone to concurrency issues. I can change the defaults on CirceCI 
>>>>>>>> config file if we agree to a new limit, the current default of 100 
>>>>>>>> iterations is quite arbitrary.
>>>>>>>> 
>>>>>>>> The test multiplexer allows to either run test individual test methods 
>>>>>>>> or entire classes. It is quite frequent to see tests methods that pass 
>>>>>>>> individually but fail when they are run together with the other tests 
>>>>>>>> in the same class. Because of this, I think that we should always run 
>>>>>>>> entire classes when repeating new or modified tests. The only 
>>>>>>>> exception to this would be Python dtests, which usually are more 
>>>>>>>> resource intensive and not so prone to that type of issues.
>>>>>>>> 
>>>>>>>>> For CI on a patch, run the pre-commit suite and also run multiplexer 
>>>>>>>>> with 250 runs on new, changed, or related tests to ensure not flaky
>>>>>>>> 
>>>>>>>> The multiplexer only allows to run a single test class per push. This 
>>>>>>>> is ok for fixing existing flakies (its original purpose), and for most 
>>>>>>>> minor changes, but it can be quite inconvenient for testing large 
>>>>>>>> patches that add or modify many tests. For example, the patch for 
>>>>>>>> CEP-19 directly modifies 31 test classes, which means 31 CircleCI 
>>>>>>>> config pushes. This number can be somewhat reduced with some wildcards 
>>>>>>>> on the class names, but the process is still quite inconvenient. I 
>>>>>>>> guess that other large patches will find the same problem. I have 
>>>>>>>> plans on modifying the multiplexer to allow specifying a list of 
>>>>>>>> classes per test target, so we don't have to needlessly suffer with 
>>>>>>>> this.
>>>>>>>> 
>>>>>>>> On Mon, 26 Sept 2022 at 22:44, Brandon Williams <dri...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>>> On Mon, Sep 26, 2022 at 1:31 PM Josh McKenzie <jmcken...@apache.org> 
>>>>>>>>> wrote:
>>>>>>>>> >
>>>>>>>>> > 250 iterations isn't enough; I use 500 as a low water mark.
>>>>>>>>> >
>>>>>>>>> > Say more here. I originally had it at 500 but neither Mick nor I 
>>>>>>>>> > knew why and figured we could suss this out on this thread.
>>>>>>>>> 
>>>>>>>>> I've seen flakies that passed with less later exhibit at that point.
>>>>>>>>> 
>>>>>>>>> > This is also assuming that circle and ASF CI run the same tests, 
>>>>>>>>> > which
>>>>>>>>> > is not entirely true.
>>>>>>>>> >
>>>>>>>>> > +1: we need to fix this. My intuition is the path to getting 
>>>>>>>>> > circle-ci in parity on coverage is a shorter path than getting ASF 
>>>>>>>>> > CI to 3 green runs for GA. That consistent w/your perception as 
>>>>>>>>> > well or do you disagree?
>>>>>>>>> 
>>>>>>>>> I agree that bringing parity to the coverage will be the shorter path.
>>>>>>> 
>>>>> 
>>> 
>

Re: [DISCUSS] Revising our release criteria, commit guidelines, and the role of circleci vs. ASF CI

Reply via email to