On 06/16/2014 05:33 AM, Thierry Carrez wrote:
David Kranz wrote:
[...]
There is a different way to do this. We could adopt the same methodology
we have now around gating, but applied to each project on its own
branch. These project branches would be integrated into master at some
frequency or when some new feature in project X is needed by project Y.
Projects would want to pull from the master branch often, but the push
process would be less frequent and run a much larger battery of tests
than we do now.
So we would basically discover the cross-project bugs when we push to
the "master master" branch. I think you're just delaying discovery of
the most complex issues, and pushing the responsibility to resolve them
onto a nonexistent set of people. Adding integration branches only makes
sense if you have an integration team. We don't have one, so we'd fall
back on the development teams to solve the same issues... with a delay.
You are assuming that the problem is cross-project bugs. A lot of these bugs are not really bugs that are *caused* by cross-project interaction. Many are project-specific bugs that could have been squashed before being integrated if enough testing had been done, but since we do all of our testing in a fully-integrated environment we often don't know where they came from. I am not suggesting this proposal would do much to get us out of the current jam, only that it would make it harder to get into one again once master is stabilized.

In our specific open development setting, delaying is bad because you
don't have a static set of developers that you can assume will be on
call ready to help with what they have written a few months later:
shorter feedback loops are key to us.
I hope you did not think I was suggesting "a few months" as a typical frequency for a project updating master. That would be unacceptable. But there is a continuum between "on every commit" and months. I was thinking of perhaps once a week but it would really depend on a lot of things that happen.

Doing this would have the following advantages:

1. It would be much harder for a race bug to get in. Each commit would
be tested many more times on its branch before being merged to master
than at present, including tests specialized for that project. The
qa/infra teams and others would continue to define acceptance at the
master level.
2. If a race bug does get in, projects have at least some chance to
avoid merging the bad code.
3. Each project can develop its own gating policy for its own branch
tailored to the issues and tradeoffs it has. This includes focus on
spending time running their own tests. We would no longer run a complete
battery of nova tests on every commit to swift.
4. If a project branch gets into the situation we are now in:
      a) it does not impact the ability of other projects to merge code
      b) it is highly likely the bad code is actually in the project so
it is known who should help fix it
      c) those trying to fix it will be domain experts in the area that
is failing
5. Distributing the gating load and policy to projects makes the whole
system much more scalable as we add new projects.
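
As a rough illustration of point 1 (a sketch under stated assumptions, not
a measurement of the gate): if a change carries a race that makes a test
run fail with some fixed probability, and runs are treated as independent,
the chance that it slips through falls off geometrically with the number
of runs it must pass. The 5% failure rate and the run counts below are
hypothetical values chosen only for illustration.

```python
def pass_probability(failure_rate: float, runs: int) -> float:
    """Chance that a racy change passes `runs` test runs in a row.

    Assumes each run fails independently with probability `failure_rate`,
    which is a simplification: real gate failures are often correlated.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - failure_rate) ** runs


if __name__ == "__main__":
    # Hypothetical race that trips 5% of test runs.
    rate = 0.05
    for runs in (1, 5, 20, 50):
        chance = pass_probability(rate, runs)
        print(f"{runs:3d} runs: slips through with probability {chance:.3f}")
```

Under these assumptions, requiring more passing runs on a project branch
before the merge to master is what makes a race less likely to land. The
price is the extra test time per commit listed in the drawbacks.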

Of course there are some drawbacks:

1. It will take longer, sometimes much longer, for any individual commit
to make it to master. Of course if a super-serious issue made it to
master and had to be fixed immediately it could be committed to master
directly.
2. Branch management at the project level would be required. Projects
would have to decide gating criteria, timing of pulls, and coordinate
around integration to master with other projects.
3. There may be some technical limitations with git/gerrit/whatever that
I don't understand but which would make this difficult.
4. It makes the whole thing more complicated from a process standpoint.
An extra drawback is that you can't really do CD anymore, because your
"master master" branch gets big chunks of new code in one go at push time.
That depends on how big and delayed the chunks are. The question is "how do we test commits enough to make sure they don't cause new races without using vastly more resources than we have, and without it taking days to test a patch?". I am suggesting an alternative as a possible least-bad approach, not a panacea. I didn't think that doing CD implied literally that the unit of integration was exactly one developer commit.

I have used this model in previous large software projects and it worked
quite well. This may also be somewhat similar to what the Linux kernel
does in some ways.
Please keep in mind that some techniques which are perfectly valid (and
even recommended) when you have a captive set of developers just can't
work in our open development setting. Some techniques which work
perfectly for a release-oriented product just don't cut it when you also
want the software to be consumable in a continuous delivery fashion. We
certainly can and should learn from other experiences, but we also need
to recognize our challenges are unique and might call for some unique
solution, with its own drawbacks and benefits.

I agree with this statement but you have not described precisely what can and cannot be done by groups of "captive developers" compared to what we have, nor analysed that requirement in the context of other choices we could make. Clearly the way we do things now would be easier if we had captive developers too. It is plausible that our community two to three years ago would have said that our current processes could not be achieved with an open development setting. Anyway, we need to do something and I look forward to other proposals that have fewer flaws than this one.

 -David

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
