On 06/16/2014 05:33 AM, Thierry Carrez wrote:
David Kranz wrote:
[...]
There is a different way to do this. We could adopt the same methodology
we have now around gating, but applied to each project on its own
branch. These project branches would be integrated into master at some
frequency or when some new feature in project X is needed by project Y.
Projects would want to pull from the master branch often, but the push
process would be less frequent and run a much larger battery of tests
than we do now.
So we would basically discover the cross-project bugs when we push to
the "master master" branch. I think you're just delaying discovery of
the most complex issues, and pushing the responsibility to resolve them
onto a nonexistent set of people. Adding integration branches only makes
sense if you have an integration team. We don't have one, so we'd fall
back on the development teams to solve the same issues... with a delay.
You are assuming that the problem is cross-project bugs. A lot of these
bugs are not really *caused* by cross-project interaction. Many are
project-specific bugs that could have been squashed before being
integrated if enough testing had been done, but since we do all of our
testing in a fully-integrated environment we often don't know where they
came from. I am not suggesting this proposal would help much to get out
of the current jam, just that it would make it harder to get into it
again once master is stabilized.
In our specific open development setting, delaying is bad because you
don't have a static set of developers that you can assume will be on
call ready to help with what they have written a few months later:
shorter feedback loops are key to us.
I hope you did not think I was suggesting "a few months" as a typical
frequency for a project updating master. That would be unacceptable. But
there is a continuum between "on every commit" and months. I was
thinking of perhaps once a week but it would really depend on a lot of
things that happen.
Doing this would have the following advantages:
1. It would be much harder for a race bug to get in. Each commit would
be tested many more times on its branch before being merged to master
than at present, including tests specialized for that project. The
qa/infra teams and others would continue to define acceptance at the
master level.
2. If a race bug does get in, projects have at least some chance to
avoid merging the bad code.
3. Each project can develop its own gating policy for its own branch
tailored to the issues and tradeoffs it has. This includes focusing
test time on the project's own tests. We would no longer run a complete
battery of nova tests on every commit to swift.
4. If a project branch gets into the situation we are now in:
a) it does not impact the ability of other projects to merge code
b) it is highly likely the bad code is actually in the project so
it is known who should help fix it
c) those trying to fix it will be domain experts in the area that
is failing
5. Distributing the gating load and policy to projects makes the whole
system much more scalable as we add new projects.
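
As a rough illustration of point 1 above: if a race bug makes a test run
fail with some fixed probability, and runs are independent (both are
simplifying assumptions, not something measured on our gate), then the
chance that the bug slips through shrinks geometrically with the number
of runs. A minimal Python sketch, where `escape_probability` and the 2%
failure rate are made up purely for illustration:

```python
def escape_probability(failure_rate: float, runs: int) -> float:
    """Chance that a race bug passes every one of `runs` test runs.

    Assumes each run fails independently with probability `failure_rate`
    when the bug is present. Real gate runs are not perfectly independent,
    so treat the result as a rough estimate only.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - failure_rate) ** runs


if __name__ == "__main__":
    # Hypothetical race bug that fails 2% of runs, compared across
    # different numbers of test runs before the merge to master.
    for runs in (1, 10, 50):
        print(runs, round(escape_probability(0.02, runs), 3))
```

The more often a commit is exercised on its project branch, the smaller
the chance that an intermittent failure goes unnoticed before the
integration into master.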
Of course there are some drawbacks:
1. It will take longer, sometimes much longer, for any individual commit
to make it to master. Of course, if a super-serious issue made it to
master and had to be fixed immediately, the fix could be committed to
master directly.
2. Branch management at the project level would be required. Projects
would have to decide gating criteria, timing of pulls, and coordinate
around integration to master with other projects.
3. There may be some technical limitations with git/gerrit/whatever that
I don't understand but which would make this difficult.
4. It makes the whole thing more complicated from a process standpoint.
An extra drawback is that you can't really do CD anymore, because your
"master master" branch gets big chunks of new code in one go at push time.
That depends on how big and delayed the chunks are. The question is "how
do we test commits enough to make sure they don't cause new races
without using vastly more resources than we have, and without it taking
days to test a patch?". I am suggesting an alternative as a possible
least-bad approach, not a panacea. I didn't think that doing CD implied
literally that the unit of integration was exactly one developer commit.
I have used this model in previous large software projects and it worked
quite well. This may also be somewhat similar to what the Linux kernel
does in some ways.
Please keep in mind that some techniques which are perfectly valid (and
even recommended) when you have a captive set of developers just can't
work in our open development setting. Some techniques which work
perfectly for a release-oriented product just don't cut it when you also
want the software to be consumable in a continuous delivery fashion. We
certainly can and should learn from other experiences, but we also need
to recognize our challenges are unique and might call for some unique
solution, with its own drawbacks and benefits.
I agree with this statement but you have not described precisely what
can and cannot be done by groups of "captive developers" compared to
what we have, nor analysed that requirement in the context of other
choices we could make. Clearly the way we do things now would be easier
if we had captive developers too. It is plausible that our community two
to three years ago would have said that our current processes could not
be achieved with an open development setting. Anyway, we need to do
something and I look forward to other proposals that have fewer flaws
than this one.
-David