On 29/09/12 04:14 PM, Justin Lebar wrote:
One proposal that's been made elsewhere 
(https://bugzilla.mozilla.org/show_bug.cgi?id=791385) is to have a soft limit 
of one active push per developer on try. If you try to push a second time before
your previous jobs are all finished, you will be asked to cancel your previous 
jobs. There would be some kind of manual override that would allow you to push 
additional patches.

I think this would likely be much less impactful than bholley's
proposed -p any, since in the common circumstance where I push to try,
notice it's going permaorange on all platforms, and then want to
cancel all remaining builds/tests, I've already wasted a lot of
resources which would have been saved by -p any.

That's not to say it's not an interesting idea; I just hope it gets
prioritized appropriately.

Also, I hope this manual override is not a pain to use.  Pretty please?  :)

The hook attached to the bug requires that you include a short string token in your commit message. The token is generated as a function of the current time, your LDAP name, and a local secret. If you don't include the token, the hook will reject your second push, remind you that you can cancel your previous jobs, and give you the token along with the time at which it expires.
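To illustrate the idea, here is a minimal sketch of such a time-limited token scheme. The token lifetime, secret, and HMAC construction are assumptions for illustration, not the actual hook implementation from the bug:

```python
import hashlib
import hmac
import time

# Assumed values -- the real hook's lifetime and secret are unknown.
TOKEN_LIFETIME = 3600  # seconds a token stays valid
SECRET = b"local-secret-known-only-to-the-hook"

def make_token(ldap_name, now=None):
    """Derive a short token from the current time window, the user's
    LDAP name, and the local secret."""
    now = int(now if now is not None else time.time())
    window = now // TOKEN_LIFETIME  # changes once per lifetime
    msg = ("%s:%d" % (ldap_name, window)).encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return digest[:8]  # short enough to paste into a commit message

def check_token(ldap_name, token, now=None):
    """Accept tokens from the current or previous window, so a token
    handed out just before a window boundary still works."""
    now = int(now if now is not None else time.time())
    current = now // TOKEN_LIFETIME
    return any(
        hmac.compare_digest(make_token(ldap_name, w * TOKEN_LIFETIME), token)
        for w in (current, current - 1)
    )
```

The hook would call `check_token` on the commit message of a second push and reject the push when it returns False, printing the token `make_token` produces so the user can retry with an override.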

Surface [the leaderboard of try abusers] on tbpl, clearly visible on the 
inbound pushes. Public shaming ftw.

The intent here is definitely not public shaming. More like public awareness. I'm in no position to judge if all those pushes are using try effectively.

If we're going to hold anyone publicly accountable, I think it should
be the teams which are responsible for ensuring we have enough
resources to run builds and tests.

We're all trying to build the best system we can here. We've been publishing as much raw data as we can, as well as reports like wait time data for ages. We're not trying to hide this stuff away. At the same time, it's impossible to give any kind of SLA when the build/test load is unbounded.

We should have a public dashboard showing end-to-end tryserver times
-- starting with a push, how long did it take for all the requested
tests to complete?  And we should surface not only the mean but also
quantiles -- for example, the 90th-percentile end-to-end time, i.e. the
time within which 90% of pushes finished.
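The quantile computation such a dashboard would need is straightforward. The push times below are invented for illustration; a real dashboard would pull per-push end-to-end times (push timestamp to last requested job finishing) from the build data:

```python
import math

def percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) of values,
    using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical end-to-end times for ten try pushes, in minutes.
push_times = [42, 55, 61, 73, 80, 95, 110, 150, 240, 420]

mean = sum(push_times) / len(push_times)
p50 = percentile(push_times, 50)
p90 = percentile(push_times, 90)
```

Note how the mean (about 133 minutes here) hides the tail: the 90th percentile is far worse, which is exactly why surfacing quantiles matters.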

I can take another stab at this. However, I'm not sure that try is the best branch to do this on, since the best-case end-to-end time varies drastically depending on which platforms/tests were selected, and on whether the user opted to cancel or rebuild jobs later.

I understand an intern worked on an approximation of this, but didn't
entirely get there, so his tool hasn't been publicly released.

If the expectation is that developers should be accountable for the
resources they use, I think it's only fair that releng/it be
accountable for the resources they provide.

We've seen that where we don't have tracking -- e.g. for how long it
takes to push to try [1], or basically for anything else at Mozilla --
we often regress the metric we're interested in.  You make what you
measure.  If we want consistently fast try pushes, it's hard to
imagine how we'd get there without public data monitoring exactly the
thing we're interested in.

I'm sure IT would love some help in figuring out how to measure this.

Cheers,
Chris

-Justin

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=691459


_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform