Oh cool. I didn't realize it was deliberately limited already. I had assumed it was just hitting the resource limits for that queue.
So it looks like it's around 20 now. However, I would argue that shortening
it further would help get patches through the gate.

For the sake of discussion, let's assume there is an 80% chance of success
in one test run on a patch. A given patch's probability of making it through
is then .8^n, where n is the number of times it has to be tested. For the
1st patch in the queue, n is just one. For the 2nd patch, n is 1 + the
probability of a failure from patch 1. For the 3rd patch, n is 1 + the
probability of a failure in patch 2 or 1. For the 4th patch, n is 1 + the
probability of a failure in patch 3, 2, or 1. And so on.

Unfortunately my conditional probability skills are too shaky to trust an
equation I come up with to represent the above scenario, so I wrote a gate
failure simulator instead [1] (a rough sketch of the same idea is in a
postscript at the bottom of this message). At a queue size of 20 and an 80%
per-run success rate, the patch in position 20 only has a ~44% chance of
getting merged. With a queue size of 4, however, the patch in position 4
has a ~71% chance of getting merged.

You can try the simulator out yourself with various numbers. Maybe the odds
of success are much better than 80% in one run and my point is moot, but I
have several patches waiting to be merged that haven't made it through
after ~3 tries each.

Cheers,
Kevin Benton

1. http://paste.openstack.org/show/83039/


On Thu, Jun 5, 2014 at 4:04 PM, Joe Gordon <joe.gord...@gmail.com> wrote:
>
> On Thu, Jun 5, 2014 at 3:29 PM, Kevin Benton <blak...@gmail.com> wrote:
>
>> Is it possible to make the depth of patches running tests in the gate
>> very shallow during this high-probability-of-failure time? e.g. allow
>> only the top 4 to run tests and put the rest in the 'queued' state.
>> Otherwise the already elevated probability of a patch failing is
>> exacerbated by the fact that it gets retested every time a patch ahead
>> of it in the queue fails.
>>
> Such a good idea that we already do it.
>
> http://status.openstack.org/zuul/
>
> The grey circles are the patches in the queued state, but that only keeps
> us from hitting resource starvation; it doesn't help us get patches
> through the gate. We haven't been landing many patches this week [0].
>
> [0] https://github.com/openstack/openstack/graphs/commit-activity
>
>
>> --
>> Kevin Benton
>>
>>
>> On Thu, Jun 5, 2014 at 5:07 AM, Sean Dague <s...@dague.net> wrote:
>>
>>> You may all have noticed things are really backed up in the gate right
>>> now, and you would be correct. (Top of gate is about 30 hrs, but if you
>>> do the math on ingress / egress rates the gate is probably really
>>> double that in transit time right now.)
>>>
>>> We've hit another threshold where there are so many really small races
>>> in the gate that they are compounding, to the point where the fix for
>>> one race often fails to merge because another race kills its job. This
>>> whole situation was exacerbated by the fact that while the transition
>>> from HP cloud 1.0 -> 1.1 was happening and we were under capacity, the
>>> check queue grew to 500 with lots of stuff being approved.
>>>
>>> That flush all hit the gate at once. It also means that those jobs
>>> passed under a very specific timing situation, which is different on
>>> the new HP cloud nodes, and the normal statistical distribution of
>>> some jobs on RAX and some on HP, which shakes out different races,
>>> didn't happen.
>>>
>>> At this point we could really use help getting focus on only recheck
>>> bugs.
>>> The current list of bugs is here:
>>> http://status.openstack.org/elastic-recheck/
>>>
>>> Also, our categorization rate is only 75%, so there are probably at
>>> least 2 critical bugs we don't even know about yet hiding in the
>>> failures. Helping categorize here -
>>> http://status.openstack.org/elastic-recheck/data/uncategorized.html -
>>> would be handy.
>>>
>>> We're coordinating changes via an etherpad here -
>>> https://etherpad.openstack.org/p/gatetriage-june2014
>>>
>>> If you want to help, jumping in #openstack-infra would be the place to
>>> go.
>>>
>>> -Sean
>>>
>>> --
>>> Sean Dague
>>> http://dague.net
>>>
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>>
>> --
>> Kevin Benton
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

--
Kevin Benton
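
P.S. For anyone who doesn't want to chase the paste link, here is a rough
sketch of the kind of Monte Carlo simulation the numbers above come from.
The paste itself isn't copied here, so the function name, defaults, and the
exact model below are illustrative assumptions rather than the real script:

import random


def simulate_gate(queue_len, success_rate, trials=20000, seed=None):
    """Estimate the chance that the patch starting in each queue position
    eventually merges, under a deliberately simplified model:

      * every test run passes independently with probability success_rate
      * each iteration, every patch still in the queue runs its tests
      * a patch that fails any of its runs is ejected and never merges
      * a patch that passes merges only if nothing ahead of it failed that
        iteration; otherwise it stays queued and gets retested
    """
    rng = random.Random(seed)
    merged = [0] * queue_len

    for _ in range(trials):
        # positions of the patches still waiting in the gate, head first
        queue = list(range(queue_len))
        while queue:
            results = [rng.random() < success_rate for _ in queue]
            survivors = []
            failure_ahead = False
            for pos, passed in zip(queue, results):
                if not passed:
                    failure_ahead = True        # ejected, never merges
                elif failure_ahead:
                    survivors.append(pos)       # passed, but must rerun
                else:
                    merged[pos] += 1            # passed with a clear path
            queue = survivors

    return [count / trials for count in merged]


if __name__ == "__main__":
    for size in (4, 20):
        probs = simulate_gate(size, 0.8, seed=1)
        print("queue size %2d -> last patch merges ~%.0f%% of the time"
              % (size, probs[-1] * 100))

The exact percentages it prints depend on how closely this matches the model
in the actual paste, but the shape of the result is the point: patches deep
in the queue get hit much harder than patches near the head.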
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev