While I've been working on bringing CircleCI to parity with Jenkins, I've made some notes about ways that the whole config generation process could be improved. Here are my thoughts. I'm not sure if this is worthy of a CEP since it's infra and not a feature.
Cheers, Derek Problem Statement ═════════════════ The CircleCI configuration subdivides various steps in the test plan into jobs that can execute independently. This set of jobs are intended to be run by developers under the free/OSS plan as well as under paid CircleCI plans for developers or organizations that wish to spend money to obtain faster test results by committing more resources (LOWRES, MIDRES, HIGHRES). The allocation of resources is currently driven by a shell script that performs textual modification (e.g. patch and sed) of files before processing to effect changes in various configuration parameters. While this approach works, it does not fully utilize the parameterization features of CircleCI’s configuration processor to reduce complexity when adding new tests or making changes to the system, imposing additional burden on developers modifying the CI configuration. This proposal details an initial goal for reducing CircleCI configuration complexity, and provides a high level overview of subsequent goals to be investigated. Goal 1: Eliminate patch files ═════════════════════════════ Patch files are targeted as the first goal for this proposal because there is a significant reduction in configuration complexity for a relatively modest effort. Patch files themselves are brittle; the patch tool can accommodate some changes between the original target file and the current state, but cannot also unambiguously apply changes. When the CircleCI configuration is changed, the patch files also need to be changed or regenerated to match line numbers and any new sections added. This is extra work that does not provide any benefit. The patch files currently apply changes to three types of configuration: • Heap size parameters (only for the HIGHRES config) • Job resource class • Executor parallelism CircleCI handles this use case via parameterization of the configuration. Interestingly, our CircleCI configuration already takes advantage of parameterization in the definition of the executor: ┌──── │ java8-executor: │ parameters: │ exec_resource_class: │ type: string └──── CircleCI additionally allows for parameters to be defined at the top level of the pipeline, which are then accessible anywhere in the pipeline definition (e.g. steps, jobs, etc). These parameters can be overridden by providing a yaml file to the `circleci config process' command. As an example of what a change would entail, consider that the patch files change the parallelism of all repated dtest executors uniformly. We could introduce a single pipeline parameter for this value: ┌──── │ parameters: │ repeated_dtest_parallelism: │ type: integer │ default: 4 └──── And then update the configuration of the executors to use the parameter: ┌──── │ j8_repeated_utest_executor: &j8_repeated_utest_executor │ executor: │ name: java8-executor │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> │ │ j8_repeated_dtest_executor: &j8_repeated_dtest_executor │ executor: │ name: java8-executor │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> │ │ j8_repeated_upgrade_dtest_executor: &j8_repeated_upgrade_dtest_executor │ executor: │ name: java8-executor │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> │ │ j8_repeated_jvm_upgrade_dtest_executor: &j8_repeated_jvm_upgrade_dtest_executor │ executor: │ name: java8-executor │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> └──── Then we create a MIDRES-specific override file containing the new value: ┌──── │ repeated_dtest_parallelism: 25 └──── and a HIGHRES-specific override file: ┌──── │ repeated_dtest_parallelism: 100 └──── And then execute the config processor with the proper override: ┌──── │ circleci config process --pipeline-parameters MIDRES-params.yml config-2_1.yml │ # or │ circleci config process --pipeline-parameters HIGHRES-params.yml config-2_1.yml └──── If a new job is added that does not use any parameters, only the `config-2_1.yml' file is modified. If new parameters are introduced, then the override files also need updated. One option (although not recommended) to this approach to help catch missed values for MID and HIGH profiles would be to preclude the use of default values and provide a LOWRES-params.yml file with the default values. One caveat is that there was a bug in the CircleCI CLI prior to version 0.1.22322 that resulted in the override file being ignored. One possible workaround, if we do not want to require a minimum circleci CLI version, would be to define the parameters in their own file and concatenate the level-specific definition file with the config-2_1.yml file prior to processing. Either approach (yaml file override or concatenation) provides benefit since there would be no need to modify the level-specific file as the main configuration is changed, unlike with patch files. The changes needed to implement this are relatively small, and are easy to test, since a change to use pipeline parameters (along with requisite changes to `generate.sh') should not result in any changes to the config.yml files. Other benefits for future investigation ═══════════════════════════════════════ Beyond simple replacement of the patch files, there are some additional capabilities in CircleCI configuration that may benefit the project long-term. One of the most interesting is the concept of matrix jobs. Matrix jobs allow you to parameterize a job over multiple values. In our case, this could potentially allow us to parameterize jobs over different version of the JDK. This would reduce the overall size of the configuration, because the current approach duplicates jobs for each Java version. It would also make it simpler to add/test new Java versions by making a relatively small number of changes to the matrix instead of creating a whole set of version-specific jobs. We would still need to create version-specific executors, but if we move things like parallelism and resource class out of the executor and into the job definition, we can parameterize in the workflow and reduce the number of executors to one per Java version. For example: ┌──── │ jobs: │ j8_unit_tests: │ parameters: │ executor: │ type: executor │ executor: << parameters.executor >> │ parallelism: << pipeline.parameters.unit_test_parallelism >> │ resource_class: << pipeline.parameters.unit_test_resclass >> │ steps: │ - attach_workspace: │ at: /home/cassandra │ - create_junit_containers │ - log_environment │ - run_parallel_junit_tests │ │ workflows: │ pre-commit_tests: │ jobs: │ - j8_unit_tests: │ requires: │ - build │ matrix: │ parameters: │ executor: [java8-executor, java11-executor, java17-executor] └──── -- +---------------------------------------------------------------+ | Derek Chen-Becker | | GPG Key available at https://keybase.io/dchenbecker and | | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org | | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC | +---------------------------------------------------------------+