[reformatted and infra tag added]
On Tue, Aug 11, 2015 at 07:32:34PM EDT, Salvatore Orlando wrote:
> On 12 August 2015 at 00:21, Sean M. Collins <s...@coreitpro.com> wrote:
> > Hello,
> >
> > Today has been an exciting day, to say the least. Earlier today I was
> > pinged on IRC about some firewall-as-a-service unit test failures that
> > were blocking patches from being merged, such as
> > https://review.openstack.org/#/c/211537/.
> >
> > Neutron devs started poking around a bit and discussing it on the IRC
> > channel:
> >
> > http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2015-08-11.log.html#t2015-08-11T16:59:13
> >
> > I've started to dig a little and document what I've found on this bug:
> >
> > https://bugs.launchpad.net/neutron/+bug/1483875
> >
> > A change recently merged in devstack-gate switches the MySQL database
> > driver and the number of workers -
> > https://review.openstack.org/#/c/210649/ - which might be what is
> > triggering the race condition, but I'm honestly not sure.
> >
> > I proposed a revert of a section of the FwaaS code -
> > https://review.openstack.org/211677 - but frankly I'm not sure it will
> > fix the problem, so I bumped it out of the merge queue when my anxiety
> > reached maximum. I'm just not confident enough in my knowledge of the
> > FwaaS codebase to be making these kinds of changes.
> >
> > Does anyone have any insights?
> >
> > --
> > Sean M. Collins
>
> I have been hit by these failures as well.
> I think you did well by bumping that revert out of the queue; I think it
> merely cures the symptom, and might affect correct operation of the
> firewall service.
> If we are looking at removing the symptom in the API job, then I'd skip
> the failing tests while somebody figures out what's going on (unless the
> team decides it is better to revert the multiple-workers change again).
>
> However, I think the issue might not be limited to the firewall. I've
> seen a worrying spike in rally failures [1]. Since that job is
> non-voting, developers probably don't pay much attention to it, but it
> provides very useful insights. I am looking at rally logs now - at the
> moment I do not yet have a clear idea of the root cause of those
> failures.

Ihar pushed a revert of the devstack-gate change[1]; maybe infra can weigh
in on that. Otherwise, if it makes everyone happier, I can just set the
failing test to skip for the time being to unblock everyone. I'll then do
the research I've been meaning to do into xfail[2], so we can keep running
the tests and capturing data without failing the job on a test or race
condition we're already aware of. A rough sketch of what that might look
like is below.

[1]: https://review.openstack.org/#/c/211853/
[2]: http://pytest.org/latest/skipping.html
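For the archives, here is a minimal sketch of what that could look like
with pytest's xfail marker. The test name and body here are hypothetical
placeholders, not the actual FwaaS test - the real marker would go on
whichever test is racing:

    import random

    import pytest


    @pytest.mark.xfail(reason="known race under multiple API workers; "
                              "https://bugs.launchpad.net/neutron/+bug/1483875")
    def test_known_racy_behaviour():
        # Placeholder assertion standing in for the real FwaaS check;
        # it fails nondeterministically, the way a race would.
        assert random.random() > 0.5

By default pytest reports a failure of such a test as XFAIL and an
unexpected pass as XPASS, and neither fails the run, so we would keep
capturing the signal without blocking the gate.

--
Sean M. Collins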