Hello Everyone,

I have a question and would like your opinions about using external
Docker images and downloaded binaries in the official releases of Apache
Airflow.

The question is: How much can we rely on those images being available in
the following cases:

A) during static checks
B) during unit tests
C) for building production images for Airflow
D) for releasing production Helm Chart for Airflow

Some more explanation:

We have been doing A) and B) in Apache Airflow for a long time, and our
practice has been that when we find an image that is good for us and seems
"legit", we use it. Example -
https://hub.docker.com/r/hadolint/hadolint/dockerfile/ - the HadoLint image
we use to check our Dockerfiles. Since this is easy to change pretty much
immediately, and is only used for building/testing, I personally have no
problem with this, and I think it saves us a lot of the time and effort we
would otherwise spend maintaining some of those images.

But we are just about to start releasing the Production Image and Helm Chart
for Apache Airflow, and I started to wonder whether this is still an
acceptable practice when - by releasing the code - we make our users depend
on those images.

The community is going to officially support both the image and the Helm
Chart, and once we release them officially, those external images and
downloads will become dependencies of our official "releases". We allow our
users to use our official Dockerfile to build a new image (with the user's
configuration), and the Helm Chart is going to be officially available for
anyone to install Airflow.

The Docker images that we are using are from various sources:

1) officially maintained images (Python, KinD, Postgres, MySQL for example)
2) images released by organizations that released them for their own
purpose, but they are not "officially maintained" by those organizations
3) images released by private individuals

While 1) is perfectly OK for both the image and the Helm Chart, I think for
2) and 3) we should bring the images under Airflow community management.

Here is the list of those images I found that we use:

   - aneeshkj/helm-unittest
   - ashb/apache-rat:0.13-1
   - godatadriven/krb5-kdc-server
   - polinux/stress (?)
   - osixia/openldap:1.2.0
   - astronomerinc/ap-statsd-exporter:0.11.0
   - astronomerinc/ap-pgbouncer:1.8.1
   - astronomerinc/ap-pgbouncer-exporter:0.5.0-1

Some of those images are released by organizations that are strong
stakeholders in the project (Astronomer especially). Some are released by
organizations that are part of the community but not as strong stakeholders
(GoDataDriven), some by private individuals who are contributors (Ash,
Aneesh), and some by parties not connected to Apache Airflow at all
(polinux, osixia).
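
To make the grouping above easier to reason about, here is a minimal
sketch in Python that splits each image reference into namespace, name and
tag, groups the images by publisher category, and lists the ones without an
explicit tag. The category labels and the helper names (split_reference,
group_by_kind, unpinned) are illustrative assumptions, not anything that
exists in the Airflow repository; the mapping only restates what is said in
this email.

```python
from collections import defaultdict

# Image references copied from the list in this email.
IMAGES = [
    "aneeshkj/helm-unittest",
    "ashb/apache-rat:0.13-1",
    "godatadriven/krb5-kdc-server",
    "polinux/stress",
    "osixia/openldap:1.2.0",
    "astronomerinc/ap-statsd-exporter:0.11.0",
    "astronomerinc/ap-pgbouncer:1.8.1",
    "astronomerinc/ap-pgbouncer-exporter:0.5.0-1",
]

# Publisher categories as described in this email (labels are illustrative).
PUBLISHER_KIND = {
    "astronomerinc": "stakeholder organization",
    "godatadriven": "community organization",
    "ashb": "individual contributor",
    "aneeshkj": "individual contributor",
    "polinux": "unrelated to Airflow",
    "osixia": "unrelated to Airflow",
}


def split_reference(ref):
    """Split 'namespace/name[:tag]' into (namespace, name, tag or None)."""
    namespace, _, rest = ref.partition("/")
    name, _, tag = rest.partition(":")
    return namespace, name, tag or None


def group_by_kind(images):
    """Group image references by the category of their publisher."""
    groups = defaultdict(list)
    for ref in images:
        namespace, _, _ = split_reference(ref)
        groups[PUBLISHER_KIND.get(namespace, "unknown")].append(ref)
    return dict(groups)


def unpinned(images):
    """Return the references that carry no explicit tag."""
    return [ref for ref in images if split_reference(ref)[2] is None]


if __name__ == "__main__":
    for kind, refs in group_by_kind(IMAGES).items():
        print(f"{kind}: {', '.join(refs)}")
    print("no explicit tag:", ", ".join(unpinned(IMAGES)))
```

A side effect of listing them this way is that it shows which references
have no explicit tag, which is a separate availability concern from who
publishes the image.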

To me it is quite clear: we are OK to rely on "officially" maintained images
and we are not OK to rely on images released by individuals in this case.
But there is a range of images in between about which I have no clarity.

So my questions are:

1) Is it acceptable to have a non-officially released image as a
dependency in the released code of an ASF project?

2) If it's not - how do we determine which images are "officially
maintained"?

3) If yes - where do we draw the boundary, i.e. when is an image
acceptable? Are there any criteria we can use, or constraints we can put on
the licences/organizations releasing the images we want to make
dependencies of our released code?

4) If some images are not acceptable, should we bring them in and release
them in a community-managed registry?

I would love to hear some opinions about those questions. Is this being
discussed in other projects? How are other projects solving it, if at all?
What registries (if any) are you using for that?

I am happy to provide more context if needed. There is an issue with more
details: https://github.com/apache/airflow/issues/9401 and a discussion
about it has started here:
https://lists.apache.org/thread.html/r0d0f6f5b3880984f616d703f2abcdef98ac13a070c4550140dcfcacf%40%3Cdev.airflow.apache.org%3E


J.

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>