# Summary

Rebuild Docker images on each build rather than using the pinned Docker Hub 
images in the [Jenkinsfile](https://github.com/apache/tvm/blob/main/Jenkinsfile#L48-L54).

# Guide

Note: This is a spin-off discussion from 
[https://discuss.tvm.apache.org/t/rfc-a-proposed-update-to-the-docker-images-ci-tag-pattern/12034/2](https://discuss.tvm.apache.org/t/rfc-a-proposed-update-to-the-docker-images-ci-tag-pattern/12034/2)

We currently run most stages in Jenkins inside stage-specific Docker images, 
such as `ci_cpu` for CPU jobs and `ci_gpu` for GPU jobs. These images are 
uploaded manually to the [tlcpack](https://hub.docker.com/u/tlcpack) Docker Hub 
account after a lengthy update process 
([example](https://github.com/apache/tvm/issues/10120)).

Instead, we could build the Docker images on each job (i.e. on every commit to 
a PR branch or to `main`). This immediately removes the need for the manual 
upload process to Docker Hub. However, in the simplest setup it would add 
significant runtime to CI jobs (roughly 30 minutes). Using Docker layer caching 
and automated updates should avoid this overhead except when it is actually 
needed (i.e. on the infrequent commits that change the Docker images).
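As a rough sketch, the cache-priming step could look like the following 
Jenkinsfile fragment. The image name, tag scheme, and Dockerfile path here are 
illustrative assumptions, not a settled design:

```groovy
// Hypothetical Jenkinsfile stage (scripted syntax): prime the layer cache
// from the last published tlcpack image, then rebuild from this commit.
stage('Build ci_cpu image') {
  sh '''
    # Pull the previous image so --cache-from can reuse unchanged layers.
    docker pull tlcpack/ci-cpu:latest || true

    # Rebuild from the Dockerfile in this commit; unchanged layers hit the
    # cache, so this is fast unless the docker/ directory actually changed.
    docker build \
      --cache-from tlcpack/ci-cpu:latest \
      -t ci-cpu:${GIT_COMMIT} \
      -f docker/Dockerfile.ci_cpu docker/
  '''
}
```

With a warm cache, the `docker build` reduces to cache hits on most commits, 
so the added runtime should be small in the common case.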

The process would look like:

- Author pushes a commit; Jenkins runs lint and kicks off jobs per execution 
environment (e.g. ARM, CPU, GPU)

- In each environment, `docker pull` the relevant base image from the tlcpack 
Docker Hub to populate the build cache

- Run `docker build` for the relevant image using the code from the commit

- Pack the Docker image as a `.tgz` for use in later stages

- Run the rest of the build using the saved image

- If run on `main` and the Docker image changed, upload the new version to 
tlcpack automatically
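The pack/reuse/publish steps above might be sketched like this (the image 
names, stash mechanics, test script path, and change-detection check are all 
assumptions for illustration):

```groovy
// Hypothetical follow-on Jenkinsfile stages (scripted syntax): pack the
// freshly built image for later stages, then publish it only on main and
// only when it actually changed.
stage('Pack image') {
  sh 'docker save ci-cpu:${GIT_COMMIT} | gzip > ci-cpu.tgz'
  stash name: 'ci-cpu-image', includes: 'ci-cpu.tgz'
}

stage('Run tests') {
  unstash 'ci-cpu-image'
  sh '''
    gunzip -c ci-cpu.tgz | docker load
    docker run --rm ci-cpu:${GIT_COMMIT} ./tests/scripts/task_cpp_unittest.sh
  '''
}

stage('Publish image') {
  if (env.BRANCH_NAME == 'main') {
    sh '''
      # Only push when the image differs from the published one; comparing
      # image IDs is one cheap way to detect a change.
      NEW_ID=$(docker images -q ci-cpu:${GIT_COMMIT})
      OLD_ID=$(docker images -q tlcpack/ci-cpu:latest)
      if [ "$NEW_ID" != "$OLD_ID" ]; then
        docker tag ci-cpu:${GIT_COMMIT} tlcpack/ci-cpu:latest
        docker push tlcpack/ci-cpu:latest
      fi
    '''
  }
}
```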

We are able to update `main` automatically because changes no longer need to 
go through `ci-docker-staging`: they run as normal PR tests, and the Docker 
images are rebuilt and used in the PR itself.

# Drawbacks

This requires experimentation to ensure it can be implemented with low 
overhead. The Docker images for the CPU and GPU environments are large (around 
25 GB), and moving this data between executors may have an impact on runtime.

cc @manupa-arm @leandron @areusch




