On 15:10, Tue, 05 Nov, James Graham wrote:
> On 05/11/13 14:57, Kyle Huey wrote:
>> On Tue, Nov 5, 2013 at 10:44 PM, David Burns <dbu...@mozilla.com> wrote:
>>> We appear to be doing 1 backout for every 15 pushes on a rough average[4]. This number, I am sure you can all agree, is far too high, especially if we think about the figures that John O'Duinn suggests[5] for the cost of running and testing each push. With the offending patch + backout we are using 508 computing hours to make essentially no changes to the tree, and then we use another 254 computing hours for the fixed reland. Note that the 508 hours doesn't include retriggers done by the Sheriffs to see whether a failure is intermittent or not. This is a lot of wasted effort when we should be striving to get patches to stick the first time. Let's see if we can try to get this figure down to 1 in 30 pushes being backed out.
>>
>> What is your proposal for doing that? What are the costs involved? It isn't very useful to say "X is bad, let's not do X" without looking at what it costs to not do X. To give one hypothetical example, if it requires just two additional full try pushes to avoid one backout, we haven't actually saved any computing time.
>
> So, as far as I can tell, the heart of the problem is that the end-to-end time for the build+test infrastructure is unworkably slow. I understand that waiting half a dozen hours (a significant fraction of a work day) for a try run is considered normal. This has a huge knock-on effect: for example, it requires people to context-switch away from one problem whilst they wait, and context-switch back into it once they have the results. Presumably it also encourages landing changes without proper testing, which increases the backout rate. It seems that this will cost a great deal, not just in terms of compute hours (which are easy to measure) but also in terms of developer productivity (which is harder to measure, but could be even more significant).
>
> What data do we currently have about why the wait time is so long? If this data doesn't exist, can we start to collect it? Are there easy wins to be had, or do we need to think about restructuring the way that we do builds and/or testing to achieve greater throughput?
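(Side note: Kyle's trade-off above is easy to make concrete. Here is a back-of-the-envelope sketch in Python; the 254 compute hours per push is inferred from David's figures, i.e. 508 hours for the offending push plus its backout being two pushes' worth, not something I've measured directly.)

    # Back-of-the-envelope check, assuming one full build+test push
    # costs ~254 compute hours (inferred: 508 hours for the offending
    # push + its backout = two pushes' worth).
    HOURS_PER_PUSH = 254

    # A backout cycle burns three pushes: bad push, backout, reland.
    wasted = 2 * HOURS_PER_PUSH        # push + backout land nothing: 508h
    reland = 1 * HOURS_PER_PUSH        # the fixed reland: 254h
    total_cycle = wasted + reland      # 762h end to end

    # Kyle's break-even point: preventing one backout only saves
    # compute if it costs fewer than two extra full try pushes.
    extra_try_pushes = 2
    prevention = extra_try_pushes * HOURS_PER_PUSH  # 508h

    print("wasted per backout cycle: %dh" % wasted)
    print("prevention cost:          %dh" % prevention)
    print("net compute saved:        %dh" % (wasted - prevention))  # 0 => break-even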
We're publishing data in several places about total run time for jobs.

For overall build metrics, you can try http://brasstacks.mozilla.com/gofaster/
For specific revisions you can query self-serve, e.g. https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/5ff9d60c6803, or in JSON: https://secure.pub.build.mozilla.org/buildapi/self-serve/try/rev/5ff9d60c6803?format=json

For historical data, you can look at all our archived build data here: http://builddata.pub.build.mozilla.org/buildjson/
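If you'd rather script against self-serve than read it in a browser, something like the following works. This is a minimal sketch in Python using requests; I'm assuming the endpoint is readable without credentials, that it returns a JSON list of job records, and that those records carry "buildername"/"starttime"/"endtime" fields with epoch-second timestamps, so inspect the actual payload before relying on any of that.

    # Minimal sketch: pull per-job timings for a try revision out of
    # buildapi self-serve. The field names ("buildername", "starttime",
    # "endtime") and the epoch-second timestamps are assumptions about
    # the payload shape -- check the real JSON first.
    import requests

    rev = "5ff9d60c6803"  # the example revision above
    url = ("https://secure.pub.build.mozilla.org/buildapi/"
           "self-serve/try/rev/%s" % rev)
    jobs = requests.get(url, params={"format": "json"}).json()

    for job in jobs:
        start, end = job.get("starttime"), job.get("endtime")
        if start and end:
            print("%-50s %5.1fh" % (job.get("buildername"),
                                    (end - start) / 3600.0))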
Average times for builds/tests on m-c are published here: https://secure.pub.build.mozilla.org/builddata/reports/reportor/daily/branch_times/output.txt

"End-to-end" times for try are here: https://secure.pub.build.mozilla.org/builddata/reports/reportor/daily/end2end_try/end2end.html

I hope this helps!

Cheers,
Chris