Thanks for bringing this topic up, Botond. The total push count includes try as well, so it is slightly misleading. We typically do not add annotations from try (there are some cases where that is done, but it is not the standard role of the sheriffs).
Here you can see the pushes we have over time:
https://sql.telemetry.mozilla.org/dashboard/firefox-ci

Looking at a monthly level, I see these numbers for (autoland, inbound, central):
July 2018 - 2049
June 2019 - 2376

So we have increased about 16%.

Keep in mind that our tests on autoland don't run on every push; many tests run only once every 5th push, which means that about 40% of these numbers is a more realistic number of runs for a given test. In addition, tests can run within 89 different configs, and a failure often happens primarily on a small number of configs at a time. Getting accurate data on this is hard, as we do a great job of tracking all failures but not all passing instances.

Given this, are there additional questions or thoughts on what we should use as criteria for disabling tests?

On Mon, Jul 8, 2019 at 12:08 PM Botond Ballo <bba...@mozilla.com> wrote:

> Hi folks,
>
> We have a policy of disabling intermittently failing tests if they
> fail more than 150 times over 21 days [1] (revised from 200 times over
> 30 days [2], which was a very similar failure rate just evaluated over
> a slightly longer period).
>
> When the policy was originally put in place in September 2017, on a
> typical week we'd have between 800 and 1000 pushes [3], so the policy
> meant disabling tests if they fail at a rate of roughly 5% (50
> failures/week out of 1000 pushes/week).
>
> These days, on a typical week we have between 4000 and 5000 pushes
> [4]. The threshold is the same, so we're now disabling tests if they
> fail at a rate of roughly 1% (50 failures/week out of 5000
> pushes/week).
>
> From an engineering point of view, keeping tests passing at a failure
> rate of below 1% is a much more significant challenge than keeping
> them passing at a failure rate of below 5% (since failures that are
> very infrequent are very time-consuming to reproduce and iterate on).
>
> Should we perhaps be revising our disablement threshold to keep pace
> with the number of pushes per week?
>
> Thanks,
> Botond
>
> [1] https://groups.google.com/d/topic/mozilla.dev.platform/346SQCu0NAM/discussion
> [2] https://groups.google.com/d/topic/mozilla.dev.platform/uJVTekj2l7I/discussion
> [3] https://bugzilla.mozilla.org/show_bug.cgi?id=1340667#c14
> [4] https://bugzilla.mozilla.org/show_bug.cgi?id=1476893#c37
> _______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
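
For anyone who wants to replay the arithmetic in this thread, here is a rough sketch in Python. It only restates the figures quoted above (150 failures over 21 days, 1000 vs. 5000 pushes/week, 2049 vs. 2376 pushes). The function names and the idea of scaling the threshold to hold a target rate constant are illustrative assumptions, not an existing tool or an agreed policy.

```python
def failures_per_week(failures: float, window_days: float) -> float:
    """Convert a failure count over a window into an average weekly count."""
    return failures * 7 / window_days


def failure_rate(failures: float, window_days: float, pushes_per_week: float) -> float:
    """Approximate per-push failure rate implied by a disablement threshold.

    Simplification (an assumption): every push runs the test once.
    """
    return failures_per_week(failures, window_days) / pushes_per_week


def threshold_for_rate(target_rate: float, window_days: float, pushes_per_week: float) -> float:
    """Hypothetical: failures per window needed to hold target_rate constant."""
    return target_rate * pushes_per_week * window_days / 7


# Figures quoted in the thread.
rate_2017 = failure_rate(150, 21, 1000)   # threshold against ~1000 pushes/week
rate_2019 = failure_rate(150, 21, 5000)   # same threshold against ~5000 pushes/week
growth = 2376 / 2049 - 1                  # autoland + inbound + central, July 2018 vs. June 2019

print(f"2017 implied rate: {rate_2017:.1%}")
print(f"2019 implied rate: {rate_2019:.1%}")
print(f"push growth excluding try: {growth:.1%}")

# Illustration only: what a 21-day threshold would be if it tracked 5000 pushes/week at 5%.
print(f"scaled threshold: {threshold_for_rate(0.05, 21, 5000):.0f} failures / 21 days")
```

Note that this treats every push as one run of the test. As pointed out above, many tests run on only a fraction of pushes and on a subset of configs, so the real per-run failure rate is higher than these per-push figures suggest.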