On 2014-04-30 16:52, Daniel Holbert wrote:
> On 03/07/2014 02:41 PM, Hal Wine wrote:
>> On 2014-02-28 17:24, Hal Wine wrote:
>>> tl;dr: what is the balance point between pushes to try taking too long
>>> and losing repository history of recent try pushes?
>> Based on the responses to this specific question, we'll go back to
>> waiting for developers to notify IT when there is enough performance
>> impact to warrant a reset of the try repository.

Thanks for reopening this thread.

>
> As documented on
>  https://bugzilla.mozilla.org/show_bug.cgi?id=994028
> we've now had multiple instances in the past few weeks where Try has
> been horked (refusing all pushes) for hours at a time, with no clear
> reason why.
>
> I'm not sure if this is caused by Try having too many heads & needing a
> reset, but it seems like it could be. (It could also be *indirectly*
> caused by the too-many-heads issue; e.g. perhaps someone
> interrupted a push because it was taking too long (due to too many
> heads), and their client inadvertently left something on the server
> locked, which then locks everyone else out for hours.)
>
> Whatever the cause, it's feeling more and more like periodic, automatic
> Try resets would be helpful to keep things running smoothly.

Yes, or a better working definition of "too much performance impact". In
this case, we had a 4h10m gap in the pushlog, and now things are back to
"normal". A reset would take about that long to perform.
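
To make "gap in the pushlog" concrete, here is a minimal sketch of how
such gaps could be flagged. It assumes the push timestamps have already
been extracted into Python datetime objects; the function name
find_push_gaps and the one-hour threshold are illustrative assumptions,
not an existing tool or an agreed definition of "too much impact".

```python
from datetime import datetime, timedelta


def find_push_gaps(push_times, threshold=timedelta(hours=1)):
    """Return (earlier, later, gap) for each pair of consecutive pushes
    that are further apart than `threshold`.

    `push_times` is any iterable of datetime objects (hypothetical input;
    how they are obtained from the pushlog is out of scope here).
    """
    ordered = sorted(push_times)
    gaps = []
    # Compare each push with the one immediately after it.
    for earlier, later in zip(ordered, ordered[1:]):
        gap = later - earlier
        if gap > threshold:
            gaps.append((earlier, later, gap))
    return gaps
```

A quiet night would also show up as a gap, so a real threshold would
probably need to take time of day or typical push rate into account.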

>
> Would it be possible to set up a system along the lines of dbaron's
> suggestion earlier in this post? (Frequent resets, with a post-reset
> step to pull in the most recent ~2 weeks worth of heads from the old
> repo, so that people's try pushes don't mysteriously disappear if they
> happen to push right before a reset.)

It is something that could be tried - we'll do a few dry runs to see
how much this adds to the duration of a try reset (given that we have to
pull those changes from the "slow repo").
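
As a rough sketch of the post-reset step described above: pick the heads
pushed within the last two weeks and build a single "hg pull" invocation
for them. This assumes we already have (node, push time) pairs for the
old repo's heads; the helper names, the 14-day default, and the repo
path in the usage comment are hypothetical. Only "hg pull -r REV SOURCE"
is a real Mercurial command form.

```python
from datetime import datetime, timedelta


def heads_to_carry_over(heads, reset_time, window=timedelta(days=14)):
    """Return the nodes of heads pushed within `window` before the reset.

    `heads` is an iterable of (node, pushed_at) pairs (assumed input).
    """
    cutoff = reset_time - window
    return [node for node, pushed_at in heads if pushed_at >= cutoff]


def pull_command(old_repo, nodes):
    """Build the argument list for pulling `nodes` from `old_repo`."""
    if not nodes:
        # Without -r, "hg pull" would fetch everything from the old repo,
        # which is exactly what the reset is meant to avoid.
        raise ValueError("no heads selected; refusing to pull everything")
    cmd = ["hg", "pull"]
    for node in nodes:
        cmd += ["-r", node]
    cmd.append(old_repo)
    return cmd


# Usage (hypothetical path):
#   nodes = heads_to_carry_over(heads, datetime.utcnow())
#   subprocess.check_call(pull_command("/path/to/old-try", nodes), cwd=new_repo)
```

Pulling a head also brings in its ancestors, so the time this adds
depends on how far those heads have diverged, which is what the dry
runs would measure.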

I also have some fresh thoughts on https://bugzil.la/691459 - there may
be some log correlation possible to get us hard data on overall success
rates and push times.

Looking forward to getting a newer (and hopefully better) approach to
this recurring issue.

--Hal

_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
