Hi All! Good to hear from you @gsemet.
> - do you want to take full "ownership" of it, if so, how to proceed? How
> should the governance of the chart be managed?

Yes, I think that adding me and Jarek would be a good starting point with
the helm chart. We can figure out long-term governance later.

> - The current implementation uses the Celery executor. Do you plan on
> switching completely to the Kubernetes Executor, or maintaining 2
> configurations in the template files? Maybe a simple value for the
> executor would be enough, I don't know, I have not actually tried the
> kube executor. To my understanding, the kube executor starts/stops each
> task in a pod, which comes with a major cost for each simple task. Would
> it be possible to keep the current Celery as the default executor and
> let users switch to the kube executor?

The astronomer helm charts have a set-up that makes it pretty easy to
switch between the Celery and the Kubernetes Executor. We're going to be
adding in Knative pretty soon as well (specifically to solve that problem
with short-term tasks, plus other major benefits for scalability and
speed).

> - helm v3 is about to be released, and this may have a major impact on
> charts (especially since they can be de-centralized). I do not know if
> this also concerns the "stable" ones, but if so, it would make sense to
> host the chart aside from airflow's code, wouldn't it?

I imagine we would keep the helm chart in a separate repo the same way we
are keeping the airflow custom controller in a separate repo. We can have
versioned releases of the helm chart the same way we will have versioned
releases of the docker image.

On Mon, Oct 14, 2019 at 7:26 AM Jarek Potiuk <jarek.pot...@polidea.com>
wrote:

> Hello Gaetan! Nice to hear from you!
>
> On Mon, Oct 14, 2019 at 10:53 AM Gaetan Semet <gae...@xeberon.net> wrote:
>
> > Hi.
> >
> > I am Gaetan Semet (gsemet), the main maintainer of the stable/airflow
> > chart.
> > I am pretty thrilled by this conversation, and would be glad to see
> > the chart switch to an official image. While the chart is pretty
> > stable and already used in production by many, I have limited time to
> > maintain it, so I would be very happy to see it directly maintained by
> > the community. Of course I will continue to help with it as much as
> > possible.
> >
> > Maybe we can start a dedicated discussion only for the chart; I have a
> > few questions on how to proceed:
> > - who to put as new OWNERS of this chart
>
> For the governance I am happy to be added to the OWNERS list, and I
> think Daniel will be happy as well :).
>
> > - do you want to take full "ownership" of it, if so, how to proceed?
> > How should the governance of the chart be managed?
>
> I think eventually we might go into the Apache governance mode where
> anyone from the Apache Committers can contribute to the helm chart, but
> then an official release of the helm chart needs to be voted on by the
> PMC (but I will let the PMC propose/decide on it).
>
> > - the current implementation uses, as said, the "puckel" image, which
> > is quite good and stable, and many users are quite happy with it.
> > Switching to an official image will require documenting the change
> > quite exhaustively, especially if some features get lost in the
> > process.
>
> I will make sure to review this. Once I get the POC to the stage where
> all tests pass (for now I still have mysql tests failing), I will review
> and point out the differences between the Puckel image and the official
> one, and will try to either bring the missing parts in or document the
> changes.
>
> > - The current implementation uses the Celery executor. Do you plan on
> > switching completely to the Kubernetes Executor, or maintaining 2
> > configurations in the template files? Maybe a simple value for the
> > executor would be enough, I don't know, I have not actually tried the
> > kube executor.
> > To my understanding, the kube executor starts/stops each task in a
> > pod, which comes with a major cost for each simple task. Would it be
> > possible to keep the current Celery as the default executor and let
> > users switch to the kube executor?
>
> I think we need Daniel to chime in on the details about Helm - he is
> also working on the AIP-25 Knative executor
> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-25+The+Knative+Executor>
> so we might well support three executors in the improved helm chart :).
>
> > Just some other questions on AIP-26:
> > - do you plan on making alpine images as well? Or minimal-ubuntu, to
> > reduce the size as much as possible?
>
> I made an initial attempt to support alpine images - I started the work
> by looking at the Astronomer images based on Alpine
> (https://github.com/astronomer/astronomer/tree/master/docker/airflow/1.10.5)
> but after some discussions with Daniel and Ash I abandoned it (and they
> both support that decision).
>
> The main problem with Alpine images is that Alpine uses the musl C/C++
> library rather than glibc. I already had a number of problems
> installing/compiling packages required by Airflow (I spent a day trying
> to make it work without using "edge" packages from Alpine). I did some
> research and comparison of sizes and came to the same conclusion as
> here: https://pythonspeed.com/articles/base-image-python-docker-images/.
> I am using *python-buster-slim* images now - the smallest python images
> based on pure debian (no ubuntu-specific changes) - and their size is
> rather small. Not as small as the alpine images, but taking into account
> the overall size of the dependencies we have for Airflow, the difference
> between slim-based and alpine-based images is not compelling.
> Quoting the pythonspeed blog: "The size benefit for Alpine isn’t even
> particularly compelling: the download size of python:3.7-slim-buster is
> 60MB, and python:3.7-alpine is 34MB, and their uncompressed on-disk size
> is 180MB and 100MB respectively."
>
> I've implemented other size optimisations in the PROD image - for
> example, I do not have NPM or node_modules in the final production
> image: the javascript is built in a separate stage and only the compiled
> javascript is copied to the final image. The current sizes of the images
> (https://cloud.docker.com/repository/docker/potiuk/airflow) are
> *master-python3.6 = 410 MB*, *master-python3.5 = 408 MB* and
> *master-python3.7 = 387 MB*, so the difference between alpine and slim
> is < 20% of the size (and it might be less, as the dependencies might be
> bigger on alpine). Debian is for sure also more future-proof in case we
> add new dependencies - for precisely the same reason (musl vs. glibc
> support).
>
> I think overall it's not worth supporting multiple base images - it
> would add a lot of complexity to the image/build process with rather
> limited benefits.
>
> > - helm v3 is about to be released, and this may have a major impact on
> > charts (especially since they can be de-centralized). I do not know if
> > this also concerns the "stable" ones, but if so, it would make sense
> > to host the chart aside from airflow's code, wouldn't it?
>
> Again - I think Daniel will be the better person to comment on that :).
>
> > So, tell me how I could help.
>
> I think reviews and comments in the PR and in the discussion here are
> the best way to help. Also, once we have the image more-or-less ready, I
> will ask for help with testing it in various scenarios, so I'd
> appreciate your help there.
>
> > Best Regards,
> > Gaetan Semet
> >
> > On Mon, Oct 14, 2019 at 8:42 AM Jarek Potiuk
> > <jarek.pot...@polidea.com> wrote:
> >
> > > Issue created! https://github.com/helm/charts/issues/17933 .
> > > Thanks Jonathan for the feedback and for bringing this up!
>
> --
> Jarek Potiuk
> Polidea <https://www.polidea.com/> | Principal Software Engineer
>
> M: +48 660 796 129 <+48660796129>
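
P.S. On Gaetan's "maybe a simple value for the executor would be enough"
idea: that is essentially how it can work - the chart exposes the executor
as a single value and the templates branch on it. A rough sketch of what
the values could look like (the key names below are invented for
illustration, not the actual schema of the astronomer chart or of
stable/airflow):

```yaml
# Illustrative only - key names are invented, not taken from any
# published chart's schema.
airflow:
  executor: CeleryExecutor   # or KubernetesExecutor, via a single value

workers:
  replicas: 4                # only meaningful for the Celery executor

redis:
  # Celery needs a broker; the Kubernetes executor does not, so the
  # templates can disable this when the executor value changes.
  enabled: true
```

Users would then switch at install/upgrade time with something like
`--set airflow.executor=KubernetesExecutor` (again, hypothetical value
name), while Celery stays the default.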
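
P.P.S. To illustrate the multi-stage build Jarek describes (javascript
built in a separate stage, only the compiled output copied into the final
image), here is a generic sketch - the stage names, paths and versions
are invented for the example and are not the actual Airflow Dockerfile:

```dockerfile
# Generic multi-stage sketch; stage names, paths and versions are
# invented, not taken from the actual Airflow Dockerfile.
FROM node:12 AS js-builder
WORKDIR /build
COPY www/package.json www/package-lock.json ./
RUN npm ci                  # node_modules exists only in this stage
COPY www/ ./
RUN npm run build           # compile the javascript into ./dist

FROM python:3.7-slim-buster
# ...install Airflow and its python dependencies here...
# Only the compiled assets cross the stage boundary; npm and
# node_modules never reach the final image, which keeps it small.
COPY --from=js-builder /build/dist /opt/airflow/www/static/dist
```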