On Tue, Dec 1, 2015 at 3:22 AM, Steven Hardy <sha...@redhat.com> wrote:
> On Mon, Nov 30, 2015 at 03:35:13PM -0800, Devananda van der Veen wrote:
> > On Mon, Nov 30, 2015 at 3:07 PM, Zane Bitter <zbit...@redhat.com> wrote:
> > > On 30/11/15 12:51, Ruby Loo wrote:
> > > > On 30 November 2015 at 10:19, Derek Higgins <der...@redhat.com> wrote:
> > > > > Hi All,
> > > > >
> > > > > A few months ago tripleo switched from its devtest based CI to
> > > > > one that was based on instack. Before doing this we anticipated
> > > > > disruption in the ci jobs and removed them from non tripleo
> > > > > projects.
> > > > >
> > > > > We'd like to investigate adding it back to heat and ironic as
> > > > > these are the two projects where we find our ci provides the
> > > > > most value. But we can only do this if the results from the job
> > > > > are treated as voting.
> > > >
> > > > What does this mean? That the tripleo job could vote and do a -1
> > > > and block ironic's gate?
> > > >
> > > > > In the past most of the non tripleo projects tended to ignore
> > > > > the results from the tripleo job as it wasn't unusual for the
> > > > > job to be broken for days at a time. The thing is, ignoring the
> > > > > results of the job is the reason (the majority of the time) it
> > > > > was broken in the first place.
> > > > >
> > > > > To decrease the number of breakages we are now no longer
> > > > > running master code for everything (for the non tripleo
> > > > > projects we bump the versions we use periodically if they are
> > > > > working). I believe with this model the CI jobs we run have
> > > > > become a lot more reliable, there are still breakages but far
> > > > > less frequently.
> > > > >
> > > > > What I'm proposing is we add at least one of our tripleo jobs
> > > > > back to both heat and ironic (and other projects associated
> > > > > with them e.g. clients, ironicinspector etc..), tripleo will
> > > > > switch to running latest master of those repositories and the
> > > > > cores approving on those projects should wait for a passing CI
> > > > > job before hitting approve.
> > > > >
> > > > > So how do people feel about doing this? Can we give it a go? A
> > > > > couple of people have already expressed an interest in doing
> > > > > this but I'd like to make sure we're all in agreement before
> > > > > switching it on.
> > > >
> > > > This seems to indicate that the tripleo jobs are non-voting, or at
> > > > least won't block the gate -- so I'm fine with adding tripleo jobs
> > > > to ironic. But if you want cores to wait/make sure they pass, then
> > > > shouldn't they be voting? (Guess I'm a bit confused.)
> > >
> > > +1
> > >
> > > I don't think it hurts to turn it on, but tbh I'm uncomfortable with
> > > the mental overhead of a non-voting job that I have to manually treat
> > > as a voting job. If it's stable enough to make it a voting job, I'd
> > > prefer we just make it voting. And if it's not then I'd like to see
> > > it be made stable enough to be a voting job and then make it voting.
> >
> > This is roughly where I sit as well -- if it's non-voting, experience
> > tells me that it will largely be ignored, and as such, isn't a good use
> > of resources.
>
> I'm sure you can appreciate it's something of a chicken/egg problem though
> - if everyone always ignores non-voting jobs, they never become voting.
>
> That effect is magnified with TripleO though, because it consumes so many
> OpenStack projects, any one of which has the capability to break our CI, so
> in an ideal world we'd have voting feedback on all-the-things, but that's
> not where we are right now due in large part to the steady stream of
> regressions (from Heat, Ironic and other projects).
>
> > I haven't looked at tripleo or tripleoci in a while, so I won't assume
> > that my recollection of the CI jobs bears any resemblance to what exists
> > today. Could you explain what areas of ironic (or its subprojects) will
> > be covered by these tests? If they are already covered by existing
> > tests, then I don't see the benefit of adding another job; conversely,
> > if this is testing areas we don't cover today, then there's probably
> > value in running tripleoci in a voting fashion for now and then moving
> > that coverage into ironic's project testing.
>
> I like to think of TripleO as a trunk-chasing "power user", and as such
> gives very valuable "user" feedback, including breaking things in exciting
> ways you hadn't anticipated in your project integration tests.
>
> This has, in the case of Heat at least, made TripleO an extremely effective
> "kitchen sink" stress test, and has uncovered numerous issues we failed to
> find with our internal tests (obviously we do add coverage when we find
> them).
>
> In the case of Ironic, I think the usage is somewhat less demanding, but no
> less "real world" - here's a good example for you:
>
> https://bugs.launchpad.net/ironic/+bug/1507738
>
> In this case, Ironic landed a change to master, which broke all existing
> deployments using CentOS/RHEL derived distributions, so master Ironic has
> been broken for folks using those distros for over 6 weeks.
>
> I know in that case, the problem was a really old ipxe image in the distro,
> and yes there were several possible workarounds, but as a developer who
> cares about users, I personally would rather get gate feedback than angry
> users on IRC/email when I unwittingly break the world for them ;)
>
> (note, I'm not assigning any blame above, it's one of *many* examples of
> unexpected breakage due to insufficient gate feedback of real usage across
> many projects).

Great example, Steve, and I agree that more and faster feedback from users
into patches is a good thing. I'm also sad that it was broken for that long
and no one raised the issue in our meeting until this week.

This particular bug highlights a gap in Ironic's test coverage which I
would be delighted if someone wants to close -- that we aren't testing
support for RH-based distros. Closing that gap doesn't require TripleoCI at
all; we should simply add a dsvm job for Ironic on Fedora, using a
Fedora-based ramdisk. That will help prevent similar regressions in the
future.

Anyway, I have big reservations about putting TripleoCI on a path to ever
gating Ironic patches. I started to bikeshed on that and then deleted it ...

tl;dr: I believe it is important for this job to vote in a non-gating way.
As a reviewer, I'm unlikely to pay attention to it if it doesn't vote, and
there's a good reason for this:

Non-voting jobs are used for experimentation. A non-voting job is a job
that we want to vote, but which we don't trust enough yet. It has been
promoted from the experimental pipeline to the check pipeline so that it
gets a lot more runs and so that we can stabilize it enough to make it
voting.

I was going to suggest that tripleoci vote as a third party CI system (I
know, it's not actually a third-party CI system, but I'd like it to vote
like one). And then I noticed that it used to do just that. [0]

If I'm interpreting it correctly, the "gate-tripleo-ironic*" jobs voted
from a separate account, left an informative -1, but did not block the
gate. That's exactly what I would like in this case.

Cheers,
-Devananda

[0] https://review.openstack.org/#/c/184402/
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev