On Friday 2014-04-04 11:58 -0700, jmaher wrote:
> As the sheriffs know, it is frustrating to deal with hundreds of tests that 
> fail intermittently on a daily basis.
> 
> When a single test case is identified to be leaking or failing at least 10% 
> of the time, it is time to escalate.
> 
> Escalation path:
> 1) Ensure we have a bug on file, with the test author, reviewer, module 
> owner, and any other interested parties, links to logs, etc.
> 2) We need to needinfo? and expect a response within 2 business days; this 
> should be made clear in a comment.
> 3) If we don't get a response, request a needinfo? from the module owner, 
> with the expectation of a response within 2 days and of someone taking 
> action.
> 4) If another 2 days pass with no response from the module owner, we will 
> disable the test.
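
The escalation path above amounts to a small decision procedure; here is a 
sketch of it in Python. The thresholds and steps come from the proposal, but 
the function, its parameter names, and the action strings are purely 
illustrative, not part of any actual sheriffing tool:

```python
from typing import Optional

def next_step(failure_rate: float,
              days_no_author_reply: int,
              days_no_owner_reply: Optional[int] = None) -> str:
    """Return the next action in the proposed escalation path.

    failure_rate: fraction of runs in which the test fails (e.g. 0.12).
    days_no_author_reply: business days since needinfo? on the test author.
    days_no_owner_reply: business days since needinfo? on the module owner,
        or None if the owner has not been pinged yet.
    """
    if failure_rate >= 0.50:
        # Exception 1: very high failure rate -> disable immediately.
        return "file bug, disable immediately"
    if failure_rate < 0.10:
        return "no escalation"
    if days_no_author_reply < 2:
        return "wait for test author"
    if days_no_owner_reply is None:
        return "needinfo? module owner"
    if days_no_owner_reply < 2:
        return "wait for module owner"
    return "disable the test"
```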

Are you talking about newly-added tests, or tests that have been
passing for a long time and recently started failing?

In the latter case, the burden should fall on the regressing patch,
and the regressing patch should get backed out instead of disabling
the test.

> Ideally we will work with the test author to either fix the test or disable 
> it, depending on the time available and the difficulty of the fix.
> 
> This is intended to respect the time of the original test authors by not 
> throwing emergencies in their laps, while still keeping the trees 
> manageable. 

If this plan is applied to existing tests, then it will lead to
style system mochitests being turned off due to other regressions
because I'm the person who wrote them and the module owner, and I
don't always have time to deal with regressions in other parts of
code (e.g., the JS engine) leading to these tests failing
intermittently.

If that happens, we won't have the test coverage we need to add new
CSS properties or values.

More generally, it places a much heavier burden on contributors who
have been part of the project longer, who are also likely to be
overburdened in other ways (e.g., reviews).  That's why the burden
needs to be placed on the regressing change rather than the original
author of the test.

> Two exceptions:
> 1) If a test is failing at least 50% of the time, we will file a bug and 
> disable the test first

These 10% and 50% numbers don't feel right to me; I think the
thresholds should probably be substantially lower.  But I think it's
easier to think about these numbers in failures/day, at least for
me.

> 2) When we are bringing a new platform online (Android 2.3, b2g, etc.) many 
> tests will need to be disabled prior to getting the tests on tbpl.

That's reasonable as long as work is done to try to get the tests
enabled (at a minimum, actually enabling all the tests that are
passing reliably, rather than stopping after enabling the passing
tests in only some directories).

-David

-- 
𝄞   L. David Baron                         http://dbaron.org/   𝄂
𝄢   Mozilla                          https://www.mozilla.org/   𝄂
             Before I built a wall I'd ask to know
             What I was walling in or walling out,
             And to whom I was like to give offense.
               - Robert Frost, Mending Wall (1914)


_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
