# Summary

Rebuild Docker images on each build rather than using the pinned Docker Hub 
images in the [Jenkinsfile](https://github.com/apache/tvm/blob/main/Jenkinsfile#L48-L54).

# Guide

Note: This is a spin-off discussion from 
[https://discuss.tvm.apache.org/t/rfc-a-proposed-update-to-the-docker-images-ci-tag-pattern/12034/2](https://discuss.tvm.apache.org/t/rfc-a-proposed-update-to-the-docker-images-ci-tag-pattern/12034/2)

We currently run most stages in Jenkins inside stage-specific Docker images, 
such as `ci_cpu` for CPU jobs and `ci_gpu` for GPU jobs. These images are 
uploaded manually to the [tlcpack](https://hub.docker.com/u/tlcpack) Docker Hub 
account after a lengthy update process 
([example](https://github.com/apache/tvm/issues/10120)).

Instead, we could build the Docker images on each job (i.e. on every commit to 
a PR branch or to `main`). This immediately removes the need for the manual 
upload process to Docker Hub. However, in the simplest setup it would add 
significant runtime to CI jobs (roughly 30 minutes). Using Docker layer caching 
and automated updates should avoid this overhead except when it is actually 
needed (i.e. on the infrequent commits that change the Docker images).
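As a rough sketch, the cache-priming step could look like the following 
Jenkinsfile fragment. The image name, tag scheme, and Dockerfile path here are 
illustrative assumptions, not a settled design:

```groovy
// Hypothetical Jenkinsfile stage (scripted syntax): prime the layer cache
// from the last published tlcpack image, then rebuild from this commit.
stage('Build ci_cpu image') {
  sh '''
    # Pull the previous image so --cache-from can reuse unchanged layers.
    docker pull tlcpack/ci-cpu:latest || true

    # Rebuild from the Dockerfile in this commit; unchanged layers hit the
    # cache, so this is fast unless the docker/ directory actually changed.
    docker build \
      --cache-from tlcpack/ci-cpu:latest \
      -t ci-cpu:${GIT_COMMIT} \
      -f docker/Dockerfile.ci_cpu docker/
  '''
}
```

With a warm cache, the `docker build` reduces to cache hits on most commits, 
so the added runtime should be small in the common case.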

The process would look like:

- Author pushes a commit; Jenkins runs lint and kicks off jobs per execution 
environment (e.g. ARM, CPU, GPU)

- In each environment, `docker pull` the relevant base image from the tlcpack 
Docker Hub to populate the build cache

- Run `docker build` for the relevant image using the code from the commit

- Pack the Docker image as a `.tgz` for use in later stages

- Run the rest of the build using the saved image

- If run on `main` and the Docker image changed, upload the new version to 
tlcpack automatically
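The pack/reuse/publish steps above might be sketched like this (the image 
names, stash mechanics, test script path, and change-detection check are all 
assumptions for illustration):

```groovy
// Hypothetical follow-on Jenkinsfile stages (scripted syntax): pack the
// freshly built image for later stages, then publish it only on main and
// only when it actually changed.
stage('Pack image') {
  sh 'docker save ci-cpu:${GIT_COMMIT} | gzip > ci-cpu.tgz'
  stash name: 'ci-cpu-image', includes: 'ci-cpu.tgz'
}

stage('Run tests') {
  unstash 'ci-cpu-image'
  sh '''
    gunzip -c ci-cpu.tgz | docker load
    docker run --rm ci-cpu:${GIT_COMMIT} ./tests/scripts/task_cpp_unittest.sh
  '''
}

stage('Publish image') {
  if (env.BRANCH_NAME == 'main') {
    sh '''
      # Only push when the image differs from the published one; comparing
      # image IDs is one cheap way to detect a change.
      NEW_ID=$(docker images -q ci-cpu:${GIT_COMMIT})
      OLD_ID=$(docker images -q tlcpack/ci-cpu:latest)
      if [ "$NEW_ID" != "$OLD_ID" ]; then
        docker tag ci-cpu:${GIT_COMMIT} tlcpack/ci-cpu:latest
        docker push tlcpack/ci-cpu:latest
      fi
    '''
  }
}
```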

We are able to update `main` automatically because changes no longer need to 
go through `ci-docker-staging`: they run as normal PR tests, and the Docker 
images are rebuilt and used in the PR itself.

# Drawbacks

This requires experimentation to ensure it can be implemented with low 
overhead. The Docker images for the CPU and GPU environments are large (around 
25 GB), and moving this data between executors may have an impact on runtime.

cc @manupa-arm @leandron @areusch




