> I think a simple metric for "is something flaky" is "does it only fail once 
> in the butler history (of 15 or so builds)".

Does a single failure make it count as flaky, though?  What if that one
failure is a timeout?  I think the failures in each failing case have to
be investigated in order to know.
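For what it's worth, the heuristic itself is simple enough to sketch. This is
only a rough illustration of the "at most one failure in ~15 builds" rule from
the quote above; the `is_flaky` function and the pass/fail history list are
made up for the example, not butler's actual API:

```python
def is_flaky(history, max_failures=1):
    """Heuristic from the thread: a test counts as 'flaky' if it failed
    at most `max_failures` times across the recent build history
    (roughly the last 15 ci-cassandra builds)."""
    failures = sum(1 for passed in history if not passed)
    # A test that never fails isn't flaky, and a test that fails
    # repeatedly looks like a real regression rather than a flake.
    return 0 < failures <= max_failures

# 15 builds with a single failure -> flagged as flaky
print(is_flaky([True] * 14 + [False]))       # True

# Repeated failures -> treated as a real (non-flaky) failure
print(is_flaky([True] * 10 + [False] * 5))   # False
```

As the reply above points out, a heuristic like this can't tell a true flake
from, say, a one-off timeout, so it would only be a first-pass filter before
investigating each failure.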

Kind Regards,
Brandon

On Thu, Aug 18, 2022 at 9:31 AM Josh McKenzie <jmcken...@apache.org> wrote:
>
> So move to beta when:
>
> all non-flaky test *failures* (NOT tickets, see below) are resolved
> We get a green ci-cassandra run
>
> Move to rc when:
>
> Three consecutive green runs in ci-cassandra
>
> Release when:
>
> All rc tickets are closed
> Some time-based gate maybe?
> Three more consecutive green ci-cassandra runs?
>
>
> We don't have people volunteering for the build lead role, so we don't 
> consistently have tickets created for flaky or non-flaky test failures; thus 
> we can't use ticket state as a gatekeeper IMO, as it's non-deterministic. Using 
> "no non-flaky failures in butler (i.e. ci-cassandra + history analysis)" should 
> shore that up. We also need a more rigorous designation for flaky vs. 
> non-flaky in our tickets than the informal practice of adding that to the 
> Summary.
>
> I think a simple metric for "is something flaky" is "does it only fail once 
> in the butler history (of 15 or so builds)".
>
> We can then filter our kanban to reflect that as well (flaky tests to 
> their own swimlane, since they're "iffy" as RC blockers; it'd technically be a 
> roll of the dice as to whether any of them flake on the 3 consecutive runs we 
> need to get green to release... which I don't love ;) ).
>
> We did something similar last time; this would be the same exception to the 
> rules, rules we continue to get closer to.
>
> If we did something similar last time and this is the same exception to the 
> rules, I don't think we're actually getting closer to satisfying those rules, 
> are we? i.e. I think we should consider formally revising the rules to match 
> the above metrics, which are a little fuzzier and more tolerant of the current 
> (and richly historical!) reality of our CI environment.
>
> Would save us a lot of back and forth on subsequent releases. :)
>
> ~Josh
>
> On Thu, Aug 18, 2022, at 1:24 AM, Berenguer Blasi wrote:
>
> +1 to Mick's points.
>
> Also notice that in CircleCI, green 4.1 runs are the norm lately, imo. Yes, 
> it's not the official CI, but it helps build an overall picture of improvement 
> towards green CI. On Jenkins, if you check the latest 4.1 runs, <5-ish failures 
> per run are starting to be common, and the runs that exceed that have known 
> failures being worked on (e.g. CAS), or infra and flaky failures that take you 
> back to the <5-ish range. So overall, if I'm not missing anything, the signal 
> amid the infra and flaky noise is pretty good.
>
> Regards
>
> On 17/8/22 22:50, Ekaterina Dimitrova wrote:
>
> +1, I second Mick on both points.
>
> On Wed, 17 Aug 2022 at 16:23, Mick Semb Wever <m...@apache.org> wrote:
>
> We're down from 13 tickets blocking 4.1 beta to 7: 
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2455.
> As mentioned above, we have some test failures w/out tickets, so realistically 
> that 7 is probably still closer to the previous count.
>
>
>
> I suggest we move to beta when all non-flaky-test tickets are resolved and we 
> get our first green ci-cassandra run.
> And I suggest we move to rc when we get three consecutive green runs.
>
> We did something similar last time; this would be the same exception to the 
> rules, rules we continue to get closer to.
>
> An alternative is to replace "green" with "builds with only non-regression 
> and infra-caused failures".
>
>
>
> - It's pretty expensive and painful to defer cleaning up CI to the end of the 
> release cycle
>
>
>
> This^
>
>