On Thu, 2014-09-11 at 07:36 -0400, Sean Dague wrote:
> >>> b) The conflict Dan is speaking of is around the current situation
> >>> where we have a limited core review team bandwidth and we have to
> >>> pick and choose which virt driver-specific features we will review.
> >>> This leads to bad feelings and conflict.
> >>
> >> The way this worked in the past is we had cores who were subject
> >> matter experts in various parts of the code -- there is a clear set
> >> of cores who "get" xen or libvirt, for example, and I feel like those
> >> drivers get reasonable review times. What's happened, though, is that
> >> we've added a bunch of drivers without adding subject matter experts
> >> to core to cover those drivers. Those newer drivers therefore have a
> >> harder time getting things reviewed and approved.
> >
> > FYI, for Juno at least I really don't consider that even the libvirt
> > driver got acceptable review times in any sense. The pain of waiting
> > for reviews of the libvirt code I've submitted this cycle is what
> > prompted me to start this thread. All the virt drivers are suffering
> > way more than they should be, but those without core team
> > representation suffer to an even greater degree. And this is ignoring
> > the point Jay & I were making about how the use of a single team means
> > that there is always contention for feature approval, so much work
> > gets cut right at the start even if the maintainers of that area felt
> > it was valuable and worth taking.
>
> I continue to not understand how N non-overlapping teams makes this any
> better. You have to pay the integration cost somewhere. Right now we're
> trying to pay it one patch at a time. This model means the integration
> units get much bigger, and with less common ground.
OK, so look at a concrete example: in 2002 the Linux kernel went with
bitkeeper precisely because we'd reached the scaling limit of a single
integration point, so we took the kernel from a single contributing team
to a bunch of them. This was expanded with git in 2005 and has led to the
hundreds of contributing teams we have today. The reason this scales
nicely is precisely because the integration costs are lower. However,
there are a couple of principles that really help us get there.

The first is internal API management: an internal API is a contract
between two teams (it may be more, but usually two). If someone wants to
change this API they have to negotiate between the two (or more) teams.
This naturally means that the affected components review the API change,
but *only* they need to review it, so it doesn't bubble up to the whole
kernel community.

The second is automation: linux-next and the zero-day test programme
build and smoke-test an integration of all our development trees. If one
team does something in its development tree that impacts another, this
system gives us immediate warning. Basically we run continuous
integration, so when Linus does his actual integration pull everything
goes smoothly (that's how we integrate all the 300 or so trees for a
kernel release in about ten days). We also now have a lot of review
automation (checkpatch.pl, for instance), but that's independent of the
number of teams.

In this model the scaling comes from the local reviews and integration:
the more teams, the greater the scaling. The factor which obstructs
scaling is the internal API ... it usually doesn't make sense to separate
a component where there's no API between the two pieces ... however, if
you think there should be one, separating them and telling the teams to
figure it out is a great way to generate the API. The point here is that
since an API is a contract, forcing people to negotiate and abide by the
contract tends to make them think much more carefully about it. The
internal API moves from being a global issue to being a local one.

By the way, the extra link work is actually time well spent, because it
means the link APIs are negotiated by teams with use cases, not just
designed by abstract architecture. The greater the link pain, the greater
the indication that there's an API problem, and the greater the pressure
on the teams at either end to fix it. Once the link pain is minimised,
the API is likely a good one.

> Look at how much active work in crossing core teams we've had to do to
> make any real progress on the neutron replacing nova-network front. And
> how slow that process is. I think you'll see that hugely show up here.

Well, as I said, separating the components leads to API negotiation
between the teams. Because of that negotiation, taking one thing and
making it two does cause more work, and it's visible work, because the
two new teams get to do the API negotiation that didn't exist before.
The trick to getting the model to scale is the network effect: the
scaling comes from splitting out into a high number of teams (say N),
and the added work comes in the links (the API contracts) between those
N teams. If the network is fully connected (everything touches everything
else), then you've achieved nothing other than a large increase in work,
because you now have N(N-1) links to negotiate and they're global, not
local.
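To put rough numbers on that, here's a quick illustrative sketch (the
team counts are made up, and the helper names are just for illustration)
of how many contracts you end up negotiating in each shape of network:

    # Illustrative only: count the API "contracts" (links) that N teams
    # have to negotiate under two different network shapes.

    def fully_connected_links(n):
        # everything touches everything else: N(N-1) directed links
        return n * (n - 1)

    def hierarchical_links(n):
        # each team negotiates only with one core team above it: N-1 links
        return n - 1

    for n in (5, 10, 50):
        print(n, fully_connected_links(n), hierarchical_links(n))

    # 5 teams:  20 links vs 4
    # 10 teams: 90 links vs 9
    # 50 teams: 2450 links vs 49

The left-hand numbers are the flat, everybody-negotiates-with-everybody
case; the right-hand ones are the hierarchical case discussed next.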
However, if the connections are nicely local and hierarchical, you find
that the number of connections is much lower than N^2 (in fact O(N) is
the ideal scaling ratio, because that means the network is largely
local) and you gain a lot of scaling. The point is that just separating
may not give scale, because you also need to minimise the number of
links by seeking locality.

In the nova case, separating nova core from a drivers core will give you
one link, but that API is fairly well known already. Separating
individual drivers from the drivers core gives you one link per driver,
but no link to nova core, so if you make the drivers core team up from
the drivers teams, you should gain added velocity for driver development
and review, and the drivers core only needs to negotiate with nova core
if the drivers core API needs modifying. The drivers can negotiate with
the drivers core for their local needs. The pain on the nova<->drivers
core link is what keeps that API sane, because changes have to be well
considered.

James

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev