Hi,

Thanks a lot for sharing these.

I am looking through the tests that we run and how we run them, as I would
really like to take a stab at improving this. However, I can't commit to it
without some agreement.

I took a hard look at archery and most of our builds, and these are my
observations:

* we heavily rely on docker image caches to store artifacts, by using the
cache action on `.docker` as well as pushing images to a registry and
fetching them from there (see the sketch right after this list)
* we use a docker-compose file to enumerate all our builds (currently 1k+
LOC)
* we use a custom-made Python package (archery) for a heterogeneous set of
tasks (releasing, merging PRs, running docker-compose, running docker)
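
For reference, the caching in the first point boils down to a workflow step
along these lines (a minimal sketch; the cache key below is illustrative,
not the exact expression we use):

    # Sketch only: cache the .docker directory between runs so that image
    # layers can be reused; the real key expression differs.
    - name: Cache docker volumes
      uses: actions/cache@v2
      with:
        path: .docker
        key: conda-integration-${{ hashFiles('ci/**') }}
        restore-keys: conda-integration-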

Let's evaluate the execution path of one of our major runs, the integration
tests that run on every push:

1. the build is triggered by the workflow `integration.yaml` on every push
and on every change to any implementation
2. the workflow installs Python, archery and docker-compose, and runs
`archery docker run conda-integration` (steps 2-3 are sketched right after
this list)
3. this calls the equivalent of `docker-compose run conda-integration`
4. this:
    4.1 builds a docker image from `conda-integration.dockerfile` that
contains Python, Conda, archery, Go, Maven, Rust and Node, all installed
via Conda
    4.2 uses this image to build every implementation (via `docker run
CMD_TO_BUILD_ALL`)
5. runs all integration tests
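
In workflow terms, steps 2-3 amount to roughly the following (a sketch; the
install command and the `dev/archery` path are assumptions on my side, not
copied from `integration.yaml`):

    # Sketch only: what the integration workflow effectively does.
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v2
      with:
        python-version: '3.8'
    - name: Install archery and docker-compose
      run: pip install docker-compose -e dev/archery  # assumed location
    - name: Build and run the integration image
      run: archery docker run conda-integration
      # ^ roughly equivalent to `docker-compose run conda-integration`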

Steps 1-4 take 20-35 minutes and step 5 takes 5 minutes, irrespective of
what code changed. IMO there is potential for a major improvement here.

Some opinionated observations that discouraged me from progressing:

1. the current setup tightly couples the builds of all implementations,
making it difficult to refactor and simplify: we have one docker image to
build all implementations, and we build them all in a single command
2. we use Conda to install dependencies such as Maven, Node, the JDK and Go
3. we use Python/archery for almost everything, even when a simpler
`docker-compose run X` would suffice

With this said, I have two changes to the current design that I would like
to work on, if there is buy-in for the general ideas:

1. Make every artifact an independent build

The integration test can be broadly described by a DAG with the following
edge list:

     cpp artifacts <- test result
     js artifacts <- test result
     go artifacts <- test result
     rust artifacts <- test result
     ...

My suggestion is that instead of running one job that builds all of these
artifacts at once and then runs the tests, we use N+1 jobs: N jobs that
each build one artifact independently, plus a "test" job that picks up
those artifacts (cached via the cache flow) and runs the actual tests. This
segmentation allows us to reuse cached artifacts when the corresponding
code does not change, which would significantly improve the performance
issue described above.
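
A minimal sketch of what such a workflow could look like, with each
implementation's artifacts keyed on its source tree (job names, paths,
service names and cache keys below are hypothetical):

    # Sketch only: one build job per implementation plus a test job.
    jobs:
      build-go:                       # repeat for cpp, js, rust, ...
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2
          - uses: actions/cache@v2
            id: go-cache
            with:
              path: go-artifacts      # hypothetical output directory
              key: go-${{ hashFiles('go/**') }}
          - name: Build Go artifacts
            if: steps.go-cache.outputs.cache-hit != 'true'
            run: docker-compose run go-build     # hypothetical service
      test:
        needs: [build-go]             # plus build-cpp, build-js, ...
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2
          - uses: actions/cache@v2    # restores the artifacts built above
            with:
              path: go-artifacts
              key: go-${{ hashFiles('go/**') }}
          - name: Run integration tests
            run: docker-compose run integration-test   # hypothetical

Whether we share the artifacts through the cache action or through
upload/download-artifact is a detail; the important part is the split into
independent, per-implementation jobs.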

2. Make every build environment dedicated to what is being built

I.e. instead of preparing one docker image that builds all of these
artifacts at once, we prepare N docker images, one per artifact: one image
to build Rust, one to build Go, one to build C++, etc. This eliminates the
tight coupling that currently exists between building the implementations.
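
A compose-file sketch of what this could look like, assuming one dedicated
dockerfile and build script per implementation (all names below are
hypothetical):

    # Sketch only: one dedicated service and image per implementation.
    version: "3.5"
    services:
      go-build:
        build:
          context: .
          dockerfile: ci/docker/go-build.dockerfile    # Go toolchain only
        volumes:
          - .:/arrow
        command: /arrow/ci/scripts/go_build.sh
      rust-build:
        build:
          context: .
          dockerfile: ci/docker/rust-build.dockerfile  # Rust toolchain only
        volumes:
          - .:/arrow
        command: /arrow/ci/scripts/rust_build.sh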

Note that I see this as a stopgap. IMO the integration tests should reuse
the artifacts already produced by each implementation's own build, by
sharing those artifacts (and not even run the integration tests when an
artifact cannot be produced, e.g. due to a compilation error), instead of
building everything twice.

Any thoughts?

Best,
Jorge

On Mon, Nov 23, 2020 at 9:58 PM Krisztián Szűcs <szucs.kriszt...@gmail.com>
wrote:

> On Mon, Nov 23, 2020 at 3:38 PM Antoine Pitrou <anto...@python.org> wrote:
> >
> >
> > Hello,
> >
> > (sorry, disregard the previous e-mail, I pressed the Send button too
> early)
> >
> > The folks at the apache-builds mailing-list gathered some statistics
> > about GHA usage of various Apache projects:
> >
> >
> https://mail-archives.apache.org/mod_mbox/www-builds/202011.mbox/%3CCADe6CU_a5_HhGNFNGGYwfCdJR0-yPxOuAwnKxaPRvnOOPp86sA%40mail.gmail.com%3E
> >
> >
> https://docs.google.com/spreadsheets/d/1SE9HIHBPmTZuW1WAgdVbEcGouGesiyrnXDIZxx25RSE/edit#gid=0
> >
> > It seems Arrow is the third biggest consumer of Apache GHA CI resources,
> > if measured by median number of in-progress workflow runs.
> > (I'm not sure whether this measures individual jobs, or if several jobs
> > are counted as a single workflow, given that GHA has a rather bizarre
> model)
> Thanks for the heads up!
>
> We have a high queued max value because of the post-release mass PR
> rebase script which distorts the average values as well.
> Based on the medians I don't think that we extremely overuse our GHA
> capacity portion.
>
> On the other hand we can remove a couple of low priority builds (or
> schedule them as nightlies).
>
> Regards, Krisztian
> >
> > Regards
> >
> > Antoine.
> >
>
