On Fri, May 19, 2023 at 1:51 PM Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> On Fri, May 19, 2023 at 01:33:50PM +0100, Camilla Conte wrote:
> > On Fri, May 19, 2023 at 10:00 AM Daniel P. Berrangé <berra...@redhat.com>
> > wrote:
> > >
> > > On Fri, Apr 07, 2023 at 03:52:51PM +0100, Camilla Conte wrote:
> > > > Configure Gitlab CI to run on Kubernetes
> > > > according to the official documentation.
> > > > https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#docker-in-docker-with-tls-enabled-in-kubernetes
> > > >
> > > > These changes are needed because of the CI jobs
> > > > using Docker-in-Docker (dind).
> > > > As soon as Docker-in-Docker is replaced with Kaniko,
> > > > these changes can be reverted.
> > > >
> > > > I documented what I did to set up the Kubernetes runner on the wiki:
> > > > https://wiki.qemu.org/Testing/CI/KubernetesRunners
> > > >
> > > > Signed-off-by: Camilla Conte <cco...@redhat.com>
> > > > ---
> > > >  .gitlab-ci.d/container-template.yml |  6 +++---
> > > >  .gitlab-ci.d/default.yml            |  3 +++
> > > >  .gitlab-ci.d/opensbi.yml            |  8 +++-----
> > > >  .gitlab-ci.d/qemu-project.yml       | 17 +++++++++++++++++
> > > >  4 files changed, 26 insertions(+), 8 deletions(-)
> > > >  create mode 100644 .gitlab-ci.d/default.yml
> > > >
> > > > diff --git a/.gitlab-ci.d/container-template.yml
> > > > b/.gitlab-ci.d/container-template.yml
> > > > index 519b8a9482..f55a954741 100644
> > > > --- a/.gitlab-ci.d/container-template.yml
> > > > +++ b/.gitlab-ci.d/container-template.yml
> > > > @@ -1,14 +1,14 @@
> > > >  .container_job_template:
> > > >    extends: .base_job_template
> > > > -  image: docker:stable
> > > > +  image: docker:20.10.16
> > > >    stage: containers
> > > >    services:
> > > > -    - docker:dind
> > > > +    - docker:20.10.16-dind
> > > >    before_script:
> > > >      - export TAG="$CI_REGISTRY_IMAGE/qemu/$NAME:latest"
> > > >      - export
> > > > COMMON_TAG="$CI_REGISTRY/qemu-project/qemu/qemu/$NAME:latest"
> > > >      - apk add python3
> > > > -    - docker info
> > > > +    - until docker info; do sleep 1; done
> > > >      - docker login $CI_REGISTRY -u "$CI_REGISTRY_USER" -p
> > > > "$CI_REGISTRY_PASSWORD"
> > > >    script:
> > > >      - echo "TAG:$TAG"
> > > > diff --git a/.gitlab-ci.d/default.yml b/.gitlab-ci.d/default.yml
> > > > new file mode 100644
> > > > index 0000000000..292be8b91c
> > > > --- /dev/null
> > > > +++ b/.gitlab-ci.d/default.yml
> > > > @@ -0,0 +1,3 @@
> > > > +default:
> > > > +  tags:
> > > > +    - $RUNNER_TAG
> > >
> > > Can we just put this in base.yml instead of creating a new file?
> >
> > Sure.
> >
> > > > diff --git a/.gitlab-ci.d/opensbi.yml b/.gitlab-ci.d/opensbi.yml
> > > > index 9a651465d8..5b0b47b57b 100644
> > > > --- a/.gitlab-ci.d/opensbi.yml
> > > > +++ b/.gitlab-ci.d/opensbi.yml
> > > > @@ -42,17 +42,15 @@
> > > >  docker-opensbi:
> > > >    extends: .opensbi_job_rules
> > > >    stage: containers
> > > > -  image: docker:stable
> > > > +  image: docker:20.10.16
> > > >    services:
> > > > -    - docker:stable-dind
> > > > +    - docker:20.10.16-dind
> > >
> > > Can you elaborate on this ? I know the docs use that particular
> > > version tag, but they don't appear to explain why. If this is not
> > > actually a hard requirement, we should keep using the stable tag.
> >
> > Yes, we can keep using "stable".
> > Then, we should be ready to address future issues that may arise from
> > "stable" not being compatible with the runner.
> >
> > > >    variables:
> > > >      GIT_DEPTH: 3
> > > >      IMAGE_TAG: $CI_REGISTRY_IMAGE:opensbi-cross-build
> > > > -    # We don't use TLS
> > > > -    DOCKER_HOST: tcp://docker:2375
> > > > -    DOCKER_TLS_CERTDIR: ""
> > >
> > > So IIUC, this was always redundant when using gitlab CI. We should just
> > > remove these in a standalone commit.
> >
> > Okay, I'll put this in a separate commit.
> >
> > > >    before_script:
> > > >      - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD
> > > > $CI_REGISTRY
> > > > +    - until docker info; do sleep 1; done
> > >
> > > Was this really needed ? The docs don't show that, and docker login is
> > > synchronous, so I wouldn't expect us to then poll on 'docker info'.
> >
> > Unfortunately, yes. We need to wait until the "docker info" command is
> > successful. This ensures that the Docker server has started and the
> > subsequent docker commands won't fail.
> > >
> > > In container-template.yml we in fact do the reverse
> > >
> > >     - docker info
> > >     - docker login $CI_REGISTRY -u "$CI_REGISTRY_USER" -p
> > > "$CI_REGISTRY_PASSWORD"
> >
> > About "docker login", as far as I understand it's a client-only
> > command. It doesn't involve the Docker server at all. These two
> > commands are not related to each other, so it doesn't matter if "docker
> > login" runs before or after "docker info".
> >
> > > imho best make this opensbi.yml file match container-template.yml, and
> > > could be part of the same cleanup commit that removes those two docker
> > > env vars.
> >
> > You mean to replace the "docker-opensbi" job in the "opensbi.yml" file
> > with the same as the ".container_job_template" from the
> > "container-template.yml" file?
> > These two look too different to me. I think we need to keep both.
>
> No, I didn't mean we have to merge them. Just that the container-template.yml
> file merely does 'docker info' without any loop. So either that one is broken,
> or using a loop in opensbi.yml is redundant.
>
> Assuming you've tested this series on k8s successfully, it would indicate
> that the looping is not required, otherwise all the container jobs would
> have failed.

Actually, I added the 'docker info' loop in the container-template.yml file too.
Or am I missing your point?

> > > >    script:
> > > >      - docker pull $IMAGE_TAG || true
> > > >      - docker build --cache-from $IMAGE_TAG --tag
> > > > $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
> > > > diff --git a/.gitlab-ci.d/qemu-project.yml
> > > > b/.gitlab-ci.d/qemu-project.yml
> > > > index a7ed447fe4..57b175f5c2 100644
> > > > --- a/.gitlab-ci.d/qemu-project.yml
> > > > +++ b/.gitlab-ci.d/qemu-project.yml
> > > > @@ -1,7 +1,24 @@
> > > >  # This file contains the set of jobs run by the QEMU project:
> > > >  # https://gitlab.com/qemu-project/qemu/-/pipelines
> > > >
> > > > +variables:
> > > > +  RUNNER_TAG: ""
> > > > +
> > > > +workflow:
> > > > +  rules:
> > > > +    # Set additional variables when running on Kubernetes.
> > > > +    # https://wiki.qemu.org/Testing/CI/KubernetesRunners
> > > > +    - if: $RUNNER_TAG == "k8s"
> > > > +      variables:
> > > > +        DOCKER_HOST: tcp://docker:2376
> > > > +        DOCKER_TLS_CERTDIR: "/certs"
> > > > +        DOCKER_TLS_VERIFY: 1
> > > > +        DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
> > >
> > > Is there any way we can get the runner itself to set these
> > > correctly by default ?
> >
> > Yes, the runner can set environment variables for its jobs.
> >
> > My concern here is that over time we lose visibility of these
> > customizations if we put them in the runner configuration.
> > This can be solved by having a repo in the qemu-project namespace to
> > host the runner configuration, something like I did here:
> > https://gitlab.com/spotlesstofu/qemu-ci-kubernetes.
>
> Cleber put configs for the current QEMU private runners into
> the main qemu.git at scripts/ci/setup/. We should have any
> setup for the k8s runner somewhere nearby too. Or move all
> of it out into a separate repository.

Okay, I'll add a patch to put the runner configuration near scripts/ci/setup/.

> > > IMHO the ideal would be that the k8s runners are registered with the
> > > qemu project to run *any* jobs without requiring tags. That way the
> > > runners will "just work" when shared runners are unavailable/exhausted,
> > > like we have with Eldon's runner
> >
> > The problem here is that the Kubernetes (k8s) runner can't run windows
> > jobs at the moment. If we wait for the shared runners to be exhausted,
> > those few windows jobs in the pipeline won't be able to run.
>
> Hmm, that's awkward. I'm not convinced we should be expecting the maintainer
> doing staging builds to decide whether or not to set the tag when running a
> pipeline. I guess maybe we can set 'RUNNER_TAG' in the web UI settings for
> CI, to turn it on globally for qemu.git upstream.

Setting the RUNNER_TAG variable in the web UI would work.
Someone would have to switch (remove) the RUNNER_TAG variable whenever
we want to use the shared runners instead of the Kubernetes runner.
This variable change could probably be automated to switch runners
once the remaining shared runner minutes drop below a certain threshold.

> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
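
For reference, on the 'docker info' wait loop discussed above: the loop in the
patch retries forever, so a dind service that never comes up would only be
caught by the job timeout. A bounded variant could look roughly like the sketch
below. This is only an illustration, not part of the series; the wait_for
helper name and the 60 second limit in the usage note are my own assumptions.

```shell
# Hypothetical helper (not in the patch): retry a command once per second
# until it succeeds, giving up after roughly TIMEOUT seconds.
# Usage: wait_for TIMEOUT COMMAND [ARGS...]
wait_for() {
    timeout="$1"
    shift
    elapsed=0
    # Same idea as "until docker info; do sleep 1; done", but bounded.
    until "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

In a before_script this would be called as "wait_for 60 docker info" before any
command that needs the Docker server. It only uses POSIX sh features, so it
should also work with the shell in the docker image.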