There are known issues with the test infrastructure (e.g. differences between weekend and weekday results), and those issues are currently being masked by human judgement. A-Team has investigated them and fixed some, but as I understand it, fixing the rest will take a non-trivial amount of effort. Once there's enough time to fix all the sources of noise in the infrastructure, human judgement will no longer be required.
As an aside, I'm answering the questions for this 48-hour backout announcement, but it's really Joel Maher + William Lachance + Vaibhav Agarwal doing all the heavy lifting related to regression handling. They're working on the regression-detection and regression-investigation tools, and they're the ones acting as perf sheriffs. Avi from my team is helping test the tools, and I just participate in policy discussions and act as an (unintentional) spokesperson :)

On Fri, Aug 14, 2015 at 8:49 PM, Martin Thomson <m...@mozilla.com> wrote:

> On Fri, Aug 14, 2015 at 3:44 PM, Vladan Djeric <vdje...@mozilla.com> wrote:
> > Is this the ts_paint regression you're referring to?
> > https://groups.google.com/forum/#!searchin/mozilla.dev.tree-alerts/ts_paint/mozilla.dev.tree-alerts/FArVsa8guXg/FfY91JK7AAAJ
>
> Yeah. I only ask because exercising judgment suppresses
> information about the stability of the tests, so that all we have is
> anecdotal evidence. That's probably OK here. The process you
> describe sounds pretty robust against false positives.