David,

Thanks for the feedback.  We know we have more work to do on our
integration gate.  It is a matter of finding people that have been trained
on gating development to do gate work.
Regards
-steve

On 7/4/16, 12:39 PM, "David Moreau Simard" <d...@redhat.com> wrote:

>I mentioned this on IRC to some extent but I'm going to post it here
>for posterity.
>
>I think we can all agree that integration tests are pretty darn
>important and I'm convinced I don't need to remind you why.
>I'm going to re-iterate that I am very concerned about the state of
>the jobs but also their coverage.
>
>Kolla provides an implementation for a lot of the big tent projects
>but they are not properly (if at all) tested in the gate.
>Only the core services are tested in an "all-in-one" fashion and if a
>commit happens to break a project that isn't tested in that all-in-one
>test, no one will know about it.
>
>This is very dangerous territory -- you can't guarantee that what
>Kolla supports really works on every commit.
>Both Packstack [1] and Puppet-OpenStack [2] have an extensive matrix
>of test coverage across different jobs and different operating systems
>to work around the memory constraints of the gate virtual machines.
>They test themselves with their project implementations in different
>ways (e.g., glance with file, glance with swift, cinder with lvm,
>cinder with ceph, neutron with ovs, neutron with linuxbridge, etc.)
>and do so successfully.
>
>I don't see why Kolla should be different if it is to be taken seriously.
>My apologies if it feels I am being harsh - I am being open and honest
>about Kolla's loss of credibility from my perspective.
>
>I've put my attempts to put Kolla in RDO's testing pipeline on hold
>for the Newton cycle.
>I hope we can straighten out all of this -- I care about Kolla and I
>want it to succeed, which is why I started this thread in the first
>place.
>
>While I don't really have the bandwidth to contribute to Kolla, I hope
>you can at least consider my feedback and you can also find me on IRC
>if you have questions.
>
>[1]: https://github.com/openstack/packstack#packstack-integration-tests
>[2]: https://github.com/openstack/puppet-openstack-integration#description
>
>David Moreau Simard
>Senior Software Engineer | Openstack RDO
>
>dmsimard = [irc, github, twitter]
>
>
>On Thu, Jun 16, 2016 at 8:20 AM, Steven Dake (stdake) <std...@cisco.com>
>wrote:
>> David,
>>
>> The gates are unreliable for a variety of reasons - some we can fix -
>> some we can't directly.
>>
>> - RDO rabbitmq introduced IPv6 support to erlang, which caused our gate
>>   reliability to drop dramatically.  Prior to this change, our gate was
>>   running at 95% reliability or better - assuming the code wasn't busted.
>> - The gate gear is different - meaning different setup.  We have been
>>   working on debugging all these various gate provider issues with infra
>>   team and I think that is mostly concluded.
>> - The gate changed to something called bindeps which has been less
>>   reliable for us.
>> - We do not have mirrors of CentOS repos - although it is in the works.
>>   Mirrors will ensure that images always get built.  At the moment many of
>>   the gate failures are triggered by build failures (the mirrors are too
>>   busy).
>> - We do not have mirrors of the other 5-10 repos and files we use.  This
>>   causes more build failures.
>>
>> Complicating matters, any of these 5 things above can crater one gate
>> job of which we run about 15 jobs, which causes the entire gate to fail
>> (if they were voting).  I really want a voting gate for kolla's jobs.  I
>> super want it.  The reason we can't make the gates voting at this time is
>> because of the sheer unreliability of the gate.
>>
>> If anyone is up for a thorough analysis of *why* the gates are failing,
>> that would help us fix them.
>>
>> Regards
>> -steve
>>
>> On 6/15/16, 3:27 AM, "Paul Bourke" <paul.bou...@oracle.com> wrote:
>>
>>>Hi David,
>>>
>>>I agree with this completely.
>>>Gates continue to be a problem for Kolla; the reasons why have been
>>>discussed in the past, but at least for me it's not clear what the key
>>>issues are.
>>>
>>>I've added this item to the agenda for today's IRC meeting (16:00 UTC -
>>>https://wiki.openstack.org/wiki/Meetings/Kolla). It may help if we can
>>>brainstorm a list of the most common problems here beforehand.
>>>
>>>To kick things off, rabbitmq seems to cause a disproportionate number of
>>>issues, and the problems are difficult to diagnose, particularly when
>>>the only way to debug is to submit "DO NOT MERGE" patch sets over and
>>>over. Here's an example of a failed centos binary gate from a simple
>>>patch set I was reviewing this morning:
>>>http://logs.openstack.org/06/329506/1/check/gate-kolla-dsvm-deploy-centos-binary/3486d03/console.html#_2016-06-14_15_36_19_425413
>>>
>>>Cheers,
>>>-Paul
>>>
>>>On 15/06/16 04:26, David Moreau Simard wrote:
>>>> Hi Kolla o/
>>>>
>>>> I'm writing to you because I'm concerned.
>>>>
>>>> In case you didn't already know, the RDO community collaborates with
>>>> upstream deployment and installation projects to test its packaging.
>>>>
>>>> This relationship is beneficial in a lot of ways for both parties, in
>>>> summary:
>>>> - RDO has improved test coverage (because it's otherwise hard to test
>>>>   different ways of installing, configuring and deploying OpenStack by
>>>>   ourselves)
>>>> - The RDO community works with upstream projects (deployment or core
>>>>   projects) to fix issues that we find
>>>> - In return, the collaborating deployment project can feel more
>>>>   confident that the RDO packages it consumes have already been tested
>>>>   using its platform and should work
>>>>
>>>> To make a long story short, we do this with a project called WeIRDO
>>>> [1] which essentially runs gate jobs outside of the gate.
>>>>
>>>> I tried to get Kolla in our testing pipeline during the Mitaka cycle.
>>>> I really did.
>>>> I contributed the necessary features I needed in Kolla in order to
>>>> make this work, like the configurable Yum repositories for example.
>>>>
>>>> However, in the end, I had to put off the initiative because the gate
>>>> jobs were very flappy and unreliable.
>>>> We cannot afford to have a job that is *expected* to flap in our
>>>> testing pipeline, it leads to a lot of wasted time, effort and
>>>> resources.
>>>>
>>>> I think there's been a lot of improvements since my last attempt but
>>>> to get a sample of data, I looked at ~30 recently merged reviews.
>>>> Of 260 total build/deploy jobs, 55 (or over 20%) failed -- and I
>>>> didn't account for rechecks, just the last known status of the check
>>>> jobs.
>>>> I put up the results of those jobs here [2].
>>>>
>>>> In the case that interests me most, CentOS binary jobs, it's 5
>>>> failures out of 50 jobs, so 10%. Not as bad but still a concern for
>>>> me.
>>>>
>>>> Other deployment projects like Puppet-OpenStack, OpenStack Ansible,
>>>> Packstack and TripleO have quite a bit of *voting* integration testing
>>>> jobs.
>>>> Why are Kolla's jobs non-voting and so unreliable?
>>>>
>>>> Thanks,
>>>>
>>>> [1]: https://github.com/rdo-infra/weirdo
>>>> [2]: https://docs.google.com/spreadsheets/d/1NYyMIDaUnlOD2wWuioAEOhjeVmZe7Q8_zdFfuLjquG4/edit#gid=0
>>>>
>>>> David Moreau Simard
>>>> Senior Software Engineer | Openstack RDO
>>>>
>>>> dmsimard = [irc, github, twitter]
>>>>
>>>>
>>>>__________________________________________________________________________
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe:
>>>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev