I don't really see why this thread keeps coming back to a choice between improving the review process and changing our automated testing - to my mind both are equally important and complementary parts of the solution:
- Automated tests are strong for objective examination of particular points of functionality. Each additional test adds one more piece of functionality that's covered.
- The review process is strong for a subjective examination of changes, and can often spot more holistic issues. Changes / improvements to the review process have the potential to address whole classes of issues.

> -----Original Message-----
> From: Sean Dague [mailto:s...@dague.net]
> Sent: 02 January 2014 00:17
> To: Tim Bell; OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility
>
> On 01/01/2014 04:30 PM, Tim Bell wrote:
> >
> >> - Changes in default behaviour: Always likely to affect existing systems in some way. Maybe we should have an additional type of review vote that comes from people who are recognised as representing large production deployments?
> >
> > This is my biggest worry... there are changes which may be technically valid but have a significant disruptive impact on those people who are running clouds. Asking the people who are running production OpenStack clouds to review every patch to understand the risks and assess the migration impact is asking a lot.
> >
> > What processes are in place to ensure that a technically correct patch does not have a major impact on those of us who are running large scale OpenStack clouds in production? DocImpact tagging is not sufficient, as this implies the change is the right thing to do but needs procedures defined.
>
> So I think this comes down to Humans vs. Machines, which is why I keep arguing that I don't think policy changes will help. This is my synthesis of a few things, and a lot of time observing our process here.
>
> Humans are limited in how big a code chunk they can effectively review (basically once a patch is over 200 lines, error detection rates start dropping), and in how much time they can spend on code review in a day (eventually you go code blind, and sometimes you don't realize it, which is the super dangerous point) [1]. When humans screw up on code review (and they figure out they screwed up), their instinct is to do less review, because no one wants to be doing a crappy job here (no footnote, just something I've observed).
>
> Adding policy complexity also reduces that review bandwidth, as more cognitive load, of any type, reduces your decision making ability [2]. This is actually the core rationale behind the hacking project: it reduces cognitive load by taking a whole class of things off the table and making it a machine decision. The fact that I needed to highlight to folks on the list, and on irc, multiple times, that approving changes with test results that are a month old isn't a good idea (and has gate impacts) demonstrates the load level reviewers are already under just in realizing the impact of a patch, so I think we can't win there.
>
> Machines.... we're very good on scalability there, running all our tests in clouds, and the like.
>
> Let's take an example pain point: code that was working on Grizzly doesn't start when you go to Havana, because there was an incompatible config file change. That's terrible. So we built a system called Grenade. We use the Grizzly config files with Havana code levels, and it has to pass Tempest smoke tests. Does it catch everything? Not at all.
> But I've watched it legitimately block a dozen changes so far in icehouse that were not getting caught in reviews, because it wasn't until people came looking for me that the authors realized they were actually breaking an upgrade path. So it is a progressive improvement.
>
> So.... coming back to your point in question, I think we need to figure out what the biggest pain points are for large scale OpenStack clouds, then encode them back into validation in our pipeline somehow.
> Michael Still recently added "turbo hipster" 3rd party testing, which ensures new code doesn't dramatically increase the time it takes to run the database migrations on large data sets. That's going after one of these issues, and is a really great addition to our toolset.
>
> For others we should figure out the top one or two problems that we can turn into gate automation in each cycle (my experience is that encoding the logic for each of these is going to take the better part of a cycle). The real advantage here is that these are gifts that keep on giving.
>
> Everyone is trying to be cognizant here of the impact on users. However, we are all just human.
>
> [1] - http://www.ibm.com/developerworks/rational/library/11-proven-practices-for-peer-review/
> [2] - http://www.jstor.org/discover/10.2307/2489029?uid=3739832&uid=2&uid=4&uid=3739256&sid=21103190092191
>
> -Sean
>
> --
> Sean Dague
> Samsung Research America
> s...@dague.net / sean.da...@samsung.com
> http://dague.net

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
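As a rough illustration of the Grenade idea described above (the previous release's config files have to keep working against the new release's code, or the gate blocks the change), here is a minimal Python sketch. The option table, config path, and comparison are hypothetical placeholders for this example; the real Grenade job boots the old config against the new code and runs Tempest smoke tests rather than comparing option lists.

#!/usr/bin/env python
# Toy upgrade-compatibility gate: flag options from an old-release
# config file that the new release no longer recognises.
# NEW_RELEASE_OPTIONS is a made-up stand-in for whatever the new code
# actually accepts.
import configparser
import sys

NEW_RELEASE_OPTIONS = {
    ("DEFAULT", "debug"),
    ("DEFAULT", "log_dir"),
    ("database", "connection"),
}


def check_old_config(path):
    """Return (section, option) pairs the new release would not accept."""
    old = configparser.ConfigParser()
    old.read(path)
    unknown = []
    for option in old.defaults():
        if ("DEFAULT", option) not in NEW_RELEASE_OPTIONS:
            unknown.append(("DEFAULT", option))
    for section in old.sections():
        for option in old.options(section):
            if option in old.defaults():
                continue  # inherited from [DEFAULT], already checked above
            if (section, option) not in NEW_RELEASE_OPTIONS:
                unknown.append((section, option))
    return unknown


if __name__ == "__main__":
    problems = check_old_config(sys.argv[1])  # e.g. a saved grizzly-era nova.conf
    for section, option in problems:
        print("old option no longer accepted: [%s] %s" % (section, option))
    if problems:
        sys.exit(1)  # fail the job, blocking the change like any other gate test
    print("old config is still accepted by the new option schema")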
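The "turbo hipster" check mentioned above lends itself to the same treatment. A minimal sketch of that kind of gate, assuming a placeholder run_migrations() and an arbitrary 30-second budget (the real job replays large real-world datasets against the project's actual migration scripts):

import sqlite3
import time


def run_migrations(conn):
    # Placeholder for the project's real schema migrations.
    conn.execute("CREATE TABLE IF NOT EXISTS instances (id INTEGER PRIMARY KEY)")
    conn.execute("ALTER TABLE instances ADD COLUMN new_flag INTEGER")
    conn.commit()


def migration_time_budget_check(db_path, budget_seconds=30.0):
    """Fail the job if the migrations take longer than the agreed budget."""
    conn = sqlite3.connect(db_path)
    start = time.monotonic()
    run_migrations(conn)
    elapsed = time.monotonic() - start
    conn.close()
    if elapsed > budget_seconds:
        raise SystemExit("migrations took %.1fs, over the %.1fs budget"
                         % (elapsed, budget_seconds))
    print("migrations completed in %.1fs" % elapsed)


if __name__ == "__main__":
    # A copy of a large production-like database would go here.
    migration_time_budget_check("large_production_copy.db")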