Back in Barcelona for the Ocata summit I presented a rough outline of a plan for us to change the way we manage dependencies across projects so that we can stop syncing them [1]. We've made some progress, and I think it's time to finish the work so I'm volunteering to take some of it up during Rocky. This email is meant to rehash and update the proposal, and fill in some of the missing details.
[1] https://etherpad.openstack.org/p/ocata-requirements-notes

TL;DR
-----

Let's stop copying exact dependency specifications into all our projects, so that each project can reflect the actual versions of the things it depends on. The constraints system in pip makes this change safe. We still need to maintain some level of compatibility, so the existing requirements-check job (run for changes to requirements.txt within each repo) will change a bit rather than going away completely. We can enable unit test jobs to verify the lower constraint settings at the same time that we're doing the other work.

Some History
------------

Back in the dark ages of OpenStack development we had a lot of trouble keeping the dependencies of all of our various projects configured so they were co-installable. Usually, but not always, the problems were caused by caps or "exclusions" (version != X) on dependencies in one project but not in another. Because pip's dependency resolver does not take into account the versions of dependencies needed by existing packages, it was quite easy to install things in the "wrong" order and end up with incompatible libraries so services wouldn't start or couldn't import plugins.

The first (working) solution to the problem was to develop a dependency management system based on the openstack/requirements repository. This system and our policies required projects to copy exactly the settings for all of their dependencies from a global list managed by a team of reviewers (first the release team, and later the requirements team). By copying exactly the same settings into all projects we ensured that they were "co-installable" without any dependency conflicts.

Having a centralized list of dependencies with a review team also gave us an opportunity to look for duplicates, packages using incompatible licenses, and otherwise curate the list of dependencies. More on that later.

Some time after we had the centralized dependency management system in place, Robert Collins worked with the PyPA folks to add a feature to pip to constrain the versions of packages that are actually installed, while still allowing a range of versions to be specified in the dependency list. We were then able to create a list of "upper constraints" -- the highest, or newest, versions -- of all of the packages we depend on and set up our test jobs to use that list to control what is actually installed. This gives us the ability to say that we need at least version X.Y.Z of a package and to force the selection of X.Y+1.0 because we want to test with that version.

The constraint feature means that we no longer need to have all of the dependency specifications match exactly, since we basically force the installation of a specific version anyway.

We've been running with both constraints and requirements syncing enabled for a while now, and I think we should stop syncing the settings to allow projects to let their lower bounds (the minimum versions of their dependencies) diverge. That divergence is useful to folks creating packages for just some of the services, especially when they are going to be deployed in isolation where co-installability is not required.

Skipping the syncs will also mean we end up releasing fewer versions of stable libraries, because we won't be raising the minimum supported versions of their dependencies automatically. That second benefit is my motivation for focusing on this right now.

Our Requirements
----------------

We have three primary requirements for managing the dependency list:

1. Maintain a list of co-installable versions of all of our dependencies.

2. Avoid breaking or deadlocking any of our gate jobs due to dependency conflicts.

3. Continue to review new dependencies for licensing, redundancy, etc.

I believe the upper-constraints.txt file in openstack/requirements satisfies the first two of these requirements.
The third means we need to continue to *have* a global requirements list, but we can change how we manage it.

In addition to these hard requirements, it would be nice if we could test the lower bounds of dependencies in projects to detect when a project is using a feature of a newer version of a library than its dependency settings indicate. Although that is a bit orthogonal to the syncing issue, I'm going to describe one way we could do that because the original plan of keeping a global list of "lower constraints" somewhat breaks our ability to stop syncing the same lower bounds into all of the projects.

What I Want to Do
-----------------

1. Update the requirements-check test job to change the check for an exact match to be a check for compatibility with the upper-constraints.txt value.

   We would check the value for the dependency from upper-constraints.txt against the range of allowed values in the project. If the constraint version is compatible, the dependency range is OK.

   This rule means that in order to change the dependency settings for a project in a way that is incompatible with the constraint, the constraint (and probably the global requirements list) would have to be changed first in openstack/requirements. However, if the change to the dependency is still compatible with the constraint, no change would be needed in openstack/requirements. For example, if the global list constrains a library to X.Y.Z and a project lists X.Y.Z-2 as the minimum version but then needs to raise that because it needs a feature in X.Y.Z-1, it can do that with a single patch in-tree.

   We also need to change requirements-check to look at the exclusions to ensure they all appear in the global-requirements.txt list (the local list needs to be a subset of the global list, but does not have to match it exactly).
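
To make the two rules above concrete, here is a minimal sketch of the kind of check requirements-check could perform. It is not the existing job's code. The function names are made up for illustration, and the version handling is deliberately simplified to plain dotted numbers with the basic comparison operators; a real implementation would presumably rely on proper PEP 440 parsing instead.

```python
import operator

# Simplified comparison operators; real tooling would follow PEP 440.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.4.0' into (1, 4); trailing zeros are dropped so 1.4 == 1.4.0."""
    parts = [int(piece) for piece in text.strip().split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def parse_specifiers(spec):
    """Split '>=1.4.0,!=1.4.1' into [('>=', (1, 4)), ('!=', (1, 4, 1))]."""
    clauses = []
    for raw in spec.split(","):
        raw = raw.strip()
        if not raw:
            continue
        # Try two-character operators before one-character ones.
        for op in (">=", "<=", "==", "!=", ">", "<"):
            if raw.startswith(op):
                clauses.append((op, parse_version(raw[len(op):])))
                break
        else:
            raise ValueError("unsupported specifier: %r" % raw)
    return clauses


def check_requirement(local_spec, global_spec, constraint):
    """Return a list of problems; an empty list means the local range is OK."""
    problems = []
    local = parse_specifiers(local_spec)
    pinned = parse_version(constraint)

    # Rule 1: the constrained version must fall inside the project's range.
    if not all(_OPS[op](pinned, bound) for op, bound in local):
        problems.append(
            "constraint %s is outside the local range %s"
            % (constraint, local_spec))

    # Rule 2: local exclusions must be a subset of the global exclusions.
    allowed = {bound for op, bound in parse_specifiers(global_spec)
               if op == "!="}
    for op, bound in local:
        if op == "!=" and bound not in allowed:
            problems.append(
                "exclusion !=%s is not in the global list"
                % ".".join(str(part) for part in bound))
    return problems


# Hypothetical library constrained to 1.4.3 in upper-constraints.txt.
GLOBAL_SPEC = ">=1.4.0,!=1.4.1"
# Raising the minimum while staying compatible needs only an in-tree patch.
print(check_requirement(">=1.4.2", GLOBAL_SPEC, "1.4.3"))
# A minimum above the constraint needs openstack/requirements changed first.
print(check_requirement(">=1.5.0", GLOBAL_SPEC, "1.4.3"))
```

The check returns a list of problems rather than a boolean so that a job could report every violation in one run.
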
   We can't have one project excluding a version that others do not, because we could then end up with a conflict with the upper constraints list that could wedge the gate as we had happen in the past.

   We also need to verify that projects do not cap dependencies for the same reason. Caps prevent us from advancing to versions of dependencies that are "too new" and possibly incompatible. We can manage caps in the global requirements list, which would cause the constraints to be calculated correctly from that list.

   This change would immediately allow all projects currently following the global requirements lists to specify different lower bounds from that global list, as long as those lower bounds still allow the dependencies to be co-installable. (The upper bounds, managed through the upper-constraints.txt list, would still be built by selecting the newest compatible version because that is how pip's dependency resolver works.)

2. We should stop syncing dependencies by turning off the propose-update-requirements job entirely.

   Turning off the job will stop the bot from proposing more dependency updates to projects. As part of deleting the job we can also remove the "requirements" case from playbooks/proposal/propose_update.sh, since it won't need that logic any more.

   We can also remove the update-requirements command from the openstack/requirements repository, since that is the tool that generates the updated list and it won't be needed if we aren't proposing updates any more.

3. Remove the minimum specifications from the global requirements list to make clear that the global list is no longer expressing minimums.

   This clean-up step has been a bit more controversial among the requirements team, but I think it is a key piece. As the minimum versions of dependencies diverge within projects, there will no longer *be* a real global set of minimum values.
   Tracking a list of "highest minimums" would either require rebuilding the list from the settings in all projects, or require two patches to change the minimum version of a dependency within a project.

   Maintaining a global list of minimums also implies that we consider it OK to run OpenStack as a whole with that list. This message conflicts with the message we've been sending about the upper constraints list since that was established, which is that we have a known good list of versions and deploying all of OpenStack with different versions of those dependencies is untested.

After these 3 steps are done, the requirements team will continue to maintain the global-requirements.txt and upper-constraints.txt files, as before. Adding a new dependency to a project will still involve a review step to add it to the global list so we can monitor licensing, duplication, python 3 support, etc. But adjusting the version numbers once that dependency is in the global list will be easier.

Testing Lower Bounds of Dependencies
------------------------------------

I don't have any personal interest in us testing against "old" versions of dependencies, but since the requirements team feels at least having a plan for such testing in place is a prerequisite for the other work, here is what I've come up with.

We can define a new test job to run the unit tests under python 3 using a tox environment called "lower-constraints" that is configured to install the dependencies for the repo using a file lower-constraints.txt that lives in the project repository.

Then, for each repository listed in projects.txt (~325 today), we need to add the job to the zuul configuration within the repo. We don't want to turn the job on voting by default globally for all of those projects because it would break until the tox environment was configured.
We don't want to turn it on non-voting because then the infra team would have ~325 patches to review as it was set to be voting for each repository individually. At some point in the future, after all of the projects have it enabled in-repo, we can move that configuration to the project-config repo in 1 patch and make the lower-constraints job part of the check-requirements job template.

To configure the job in a given repo we will need to run a few separate steps to prepare a single patch like https://review.openstack.org/#/c/550603/ (that patch is experimental and contains the full job definition, which won't be needed everywhere).

1. Set up a new tox environment called "lower-constraints" with base-python set to "python3" and with the deps setting configured to include a copy of the existing global lower constraints file from the openstack/requirements repo.

2. Run "tox -e lower-constraints --notest" to build a virtualenv using the lower constraints.

3. Run ".tox/lower-constraints/bin/pip freeze > lower-constraints.txt" to create the initial version of the lower-constraints.txt file for the current repo.

4. Modify the tox settings for lower-constraints to point to the new file that was generated instead of the global list.

5. Update the zuul configuration to add the new job defined in project-config.

The results of those steps can be combined into a single patch and proposed to the project. To avoid overwhelming zuul's job configuration resolver, we need to propose the patches in separate batches of about 10 repos at a time.

This is all mostly scriptable, so I will write a script and propose the patches (unless someone else wants to do it all -- we need a single person to keep up with how many patches we're proposing at one time).

The point of creating the initial lower-constraints.txt file is not necessarily to be "accurate" with the constraints immediately, but to have something to work from.
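
As a rough illustration of how steps 1 through 4 and the batching might be scripted, here is a sketch in Python. It is not the script that will actually be used. The function names are hypothetical, and the requirements.txt and test-requirements.txt entries in deps are assumptions about a typical repo layout. The tox key is written as basepython, which is how tox spells that setting. Step 5 (the zuul configuration) is not covered.

```python
import os
import subprocess

ENV_NAME = "lower-constraints"


def tox_section(constraints_path, python3=True):
    """Render the tox.ini section used in steps 1 and 4."""
    lines = ["[testenv:%s]" % ENV_NAME]
    if python3:
        lines.append("basepython = python3")
    lines += [
        "deps =",
        "  -c%s" % constraints_path,
        # Assumed file names; adjust to whatever the repo actually uses.
        "  -r{toxinidir}/requirements.txt",
        "  -r{toxinidir}/test-requirements.txt",
    ]
    return "\n".join(lines) + "\n"


def freeze_lower_constraints(repo_dir):
    """Steps 2 and 3: build the virtualenv, then save the frozen versions."""
    repo_dir = os.path.abspath(repo_dir)
    subprocess.run(["tox", "-e", ENV_NAME, "--notest"],
                   cwd=repo_dir, check=True)
    pip = os.path.join(repo_dir, ".tox", ENV_NAME, "bin", "pip")
    frozen = subprocess.run([pip, "freeze"], cwd=repo_dir, check=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    target = os.path.join(repo_dir, "lower-constraints.txt")
    with open(target, "w") as handle:
        handle.write(frozen.stdout)
    return target


def batches(repos, size=10):
    """Yield the repositories in groups of `size` for staggered proposals."""
    for start in range(0, len(repos), size):
        yield repos[start:start + size]


# Step 4: the section once it points at the per-repo file.
print(tox_section("{toxinidir}/lower-constraints.txt"))
```

Calling tox_section() once with the path of the copied global file and once with the generated file covers steps 1 and 4. Passing python3=False leaves out the interpreter setting for repos that only support python 2.
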
After the patches are proposed, please either plan to land them or vote -2 indicating that you don't want a job like that on that repo. If you want to change the constraints significantly, please do that in a separate patch. With ~325 of them, I'm not going to be able to keep up with everyone's separate needs and this is all meant to just establish the initial version of the job anyway.

For projects that currently only support python 2 we can modify the proposed patches to not set base-python to use python3.

You will have noticed that this will only apply to unit test jobs. Projects are free to use the results to add their own functional test jobs using the same lower-constraints.txt files, but that's up to them to do. For the reasons outlined above about why we want divergence, I don't think it makes much sense to run a full integration job with the other projects, since their dependency lists may differ.

Sorry for the length of this email, but we don't have a specs repo for the requirements team and I wanted to put all of the details of the proposal down in one place for discussion.

Let me know what you think,
Doug