On 25 April 2013 16:11:59, Justin Lebar wrote:
Which specific proposals should we start with? As you say, there are dozens
of ideas out there, none with any kind of consensus behind them.

If a preponderance of options is actually the only thing standing in
the way of serious and timely work being done by releng here, I would
be more than happy to assist you guys in narrowing things down to
fewer choices.

On Thu, Apr 25, 2013 at 10:27 AM, Chris AtLee <cat...@mozilla.com> wrote:
On 01:48, Thu, 25 Apr, Justin Lebar wrote:

One idea might be to give developers feedback on the consequences of a
particular push, e.g. the AWS cost, a proxy for "time during which
developers couldn't push" or some other measurable metric.  Right now
each push probably "feels" as "expensive" as every other.


For tryserver, I proposed bug 848589 to do just this.  I think it's
worth trying, but someone needs to implement it.
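
As a rough illustration of the per-push feedback idea above, here is a minimal sketch in Python. It is not how bug 848589 proposes to do it; the `Job` record, the machine-minute figures, and the hourly rates are all hypothetical placeholders chosen only to show the arithmetic.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Job:
    """One build or test job triggered by a push (hypothetical model)."""
    name: str
    machine_minutes: float   # minutes the job occupies a machine
    hourly_rate_usd: float   # assumed price of that machine per hour


def estimate_push_cost(jobs: Iterable[Job]) -> Tuple[float, float]:
    """Return (total machine-minutes, total estimated cost in USD)."""
    jobs = list(jobs)
    total_minutes = sum(j.machine_minutes for j in jobs)
    total_usd = sum(j.machine_minutes / 60.0 * j.hourly_rate_usd for j in jobs)
    return total_minutes, total_usd


def format_feedback(jobs: Iterable[Job]) -> str:
    """Build a one-line summary that could be shown to the pusher."""
    minutes, usd = estimate_push_cost(jobs)
    return f"This push used about {minutes:.0f} machine-minutes (roughly ${usd:.2f})."
```

Machine-minutes are reported alongside the dollar figure because they also serve as a proxy for how long the push kept shared capacity busy.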

Nobody's blaming the user.  We should just empower them to make better
choices.


Okay.

I guess what's frustrating to me is that we have this problem and
essentially our only option to solve it is to change users' behavior.
I totally believe that some people could use resources much more
efficiently, but it's frustrating if changing user behavior is our
only tool.

We keep talking about this every few weeks, as though there's some
hidden solution that will emerge only after ten newsgroup threads.  In
actuality, we very likely will need to do a bunch of different things,
each having a small impact.  And in particular, I don't think we'll
solve this problem without significant work from release engineering.
If that work isn't forthcoming, I don't think we're going to make a
significant dent in this.


I would say without significant work from *everyone*. There's only so much
releng can do here.

Which specific proposals should we start with? As you say, there are dozens
of ideas out there, none with any kind of consensus behind them.
We're already adding capacity as quickly as we can, and as Ehsan implied, I
don't think we're ever going to have enough capacity. We're always adding
more, but developer activity and the load generated per developer have always
increased faster than we have been able to add more capacity. I can't see
that changing anytime soon. New mobile test platforms are particularly
challenging to bring online, as anybody from releng/ateam or IT will tell
you.

As we work on adding more capacity, we also need to look at how to be
smarter with our existing resources, which comes down to doing less work
overall, or being able to smooth out the peaks in our load without
impacting developers.

I would *love* some help from developers to make tests run more
efficiently. At last week's infra load meeting, gps mentioned that there are
plenty of tests that could be parallelized. There are also bugs like
https://bugzilla.mozilla.org/show_bug.cgi?id=864085 where Windows debug
browser chrome test times have been steadily increasing over the past year.
Can we find other cases like this?
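
One way to look for other cases like this is to fit a trend line to each suite's recorded run times and flag the suites that have grown the most. The sketch below assumes a hypothetical `history` mapping from suite name to a chronological list of durations; it does not reflect any existing releng tooling or data format, and the 20% threshold is an arbitrary example value.

```python
from typing import Dict, List, Sequence, Tuple


def slope_per_sample(durations: Sequence[float]) -> float:
    """Least-squares slope of duration against sample index."""
    n = len(durations)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2.0
    mean_y = sum(durations) / n
    num = sum((i - mean_x) * (d - mean_y) for i, d in enumerate(durations))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def find_slowing_suites(
    history: Dict[str, List[float]],
    min_relative_growth: float = 0.2,
) -> List[Tuple[str, float]]:
    """Return (suite, relative growth) pairs, largest growth first.

    Relative growth is the fitted increase over the whole window
    divided by the mean duration in that window.
    """
    flagged = []
    for suite, durations in history.items():
        if len(durations) < 2:
            continue
        mean = sum(durations) / len(durations)
        if mean <= 0:
            continue
        growth = slope_per_sample(durations) * (len(durations) - 1) / mean
        if growth >= min_relative_growth:
            flagged.append((suite, growth))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Fitting a slope rather than comparing only the first and last samples makes the check less sensitive to a single unusually slow or fast run.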

Also, there are plenty of jobs that are run per push, but are hidden on
inbound and other branches. I would love to be able to disable those from
running per push and save everybody from paying the cost of running them.

Cheers,
Chris
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform

One obvious way to reduce load on try is to allow jobs to be requested and cancelled at a finer granularity. Right now, before I push I have to ask myself, "What's the most I could possibly need?" And if I don't request enough, I have to push an entirely new job!

I know I'd request far fewer jobs from try if, after pushing, I could trigger or cancel builds per platform and request or cancel particular tests.

--Chris