On 07/04/14 05:10 AM, James Graham wrote:
On 07/04/14 04:33, Andrew Halberstadt wrote:
On 06/04/14 08:59 AM, Aryeh Gregor wrote:
Is there any reason in principle that we couldn't have the test runner
automatically rerun tests with known intermittent failures a few
times, and let the test pass if it passes a few times in a row after
the first fail?  This would be a much nicer option than disabling the
test entirely, and would still mean the test is mostly effective,
particularly if only specific failure messages are allowed to be
auto-retried.
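
A minimal sketch of the retry policy proposed above, for illustration only. The function name `run_with_retry`, its parameters, and the result strings are hypothetical and do not correspond to any existing test runner's API. It assumes a test is a callable that returns None on success or a failure message string.

```python
import re


def run_with_retry(run_test, allowed_patterns, required_passes=3):
    """Run a test once; on an allow-listed failure, require consecutive passes.

    run_test: callable returning None on pass, or a failure message (str).
    allowed_patterns: regexes for failure messages eligible for auto-retry.
    required_passes: passes in a row needed after the first failure.
    """
    first_failure = run_test()
    if first_failure is None:
        return "pass"

    # Only specific, known failure messages may be retried.
    if not any(re.search(p, first_failure) for p in allowed_patterns):
        return "fail"

    # Any failure during the retries counts as a real failure.
    for _ in range(required_passes):
        if run_test() is not None:
            return "fail"
    return "pass-after-retry"
```

Returning a distinct "pass-after-retry" result, rather than a plain pass, keeps the retried runs countable so they can still be reported somewhere.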

Many of our test runners have that ability. But doing this implies that
intermittents are always the fault of the test. We'd be missing whole
classes of regressions (notably race conditions).

In practice how effective are we at identifying bugs that lead to
instability? Is it more common that we end up disabling the test, or
marking it as "known intermittent" and learning to live with the
instability, both of which options reduce the test coverage, or is it
more common that we realise that there is a code reason for the
intermittent, and get it fixed?

I would guess the former is true in most cases. But at least there we have a *chance* at tracking down and fixing the failure, even if it takes a while before it becomes annoying enough to prioritize. If we made it so intermittents never annoyed anyone, there would be even less motivation to fix them. Yes, in theory we would still have a list of top failing intermittents. In practice, that list would be ignored.

Case in point, desktop xpcshell does this right now. Open a log and ctrl-f for "Retrying tests". Most runs have a few failures that got retried. No one knows about these and no one looks at them. Publishing results somewhere easy to discover would definitely help, but I'm not convinced it will help much.

Doing this would also cause us to miss non-intermittent regressions, e.g. where the ordering of tests tickles the platform the wrong way. On the retry, the test would run in a completely different order and might show up green 100% of the time.
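
To make the ordering point concrete, here is a toy sketch. The tests `test_a` and `test_b` and the helpers `make_suite` and `run_in_order` are hypothetical. It assumes each run starts from fresh state, as a retry in a new process would.

```python
def make_suite():
    """Build a fresh pair of tests sharing state, as a new process would."""
    state = {"cache_primed": False}

    def test_a():
        # Leaks state that later tests can observe.
        state["cache_primed"] = True
        return True

    def test_b():
        # Fails deterministically, but only when test_a has run first.
        return not state["cache_primed"]

    return {"test_a": test_a, "test_b": test_b}


def run_in_order(order):
    """Run the named tests in the given order against fresh state."""
    tests = make_suite()
    return {name: tests[name]() for name in order}


full_run = run_in_order(["test_a", "test_b"])  # test_b fails here
retry = run_in_order(["test_b"])               # test_b passes when run alone
```

In this toy case the failure reproduces every time in the original order, yet the isolated retry always passes, so an auto-retry would report the run as green.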

Either way, the problem is partly culture and partly inadequate tooling. I see where this proposal is coming from, but I think there are ways of tackling the problem head on. This seems kind of like a last resort.

Andrew
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform