The Docker image `pulsar-all` is a convenience image built on top of the base `pulsar` image. It includes all the Pulsar IO connectors as well as the tiered storage offloaders.

The Dockerfile for `pulsar-all` can be found here:
https://github.com/apache/pulsar/blob/master/docker/pulsar-all/Dockerfile

The resulting image is very big:

```
apachepulsar/pulsar-all   3.1.2   3d1aa250bf6c   2 months ago   3.68GB
```

This poses a challenge in many ways:

1. Our CI pipeline needs to build these images and cache them across different stages of the pipeline.
2. It takes a lot of time for release managers to build and push these images to Docker Hub.
3. Users running this image in production see very long download times, which can affect the availability of the system (e.g. the longer a restart takes, the higher the chance that a second broker crashes in the meantime).
4. It's very unlikely that any one user will require all the connectors; most likely they would use just 2-3 of them.

The problem is that `pulsar-all` was introduced at a time when there were ~3 Pulsar IO connectors. Right now we have 35 connectors, with a total size of 1.9 GB.

The proposal here is to drop this image altogether. Users will be able to construct their own targeted images in a very simple way:

```
FROM apachepulsar/pulsar:latest
RUN mkdir -p connectors && \
    cd connectors && \
    wget https://downloads.apache.org/pulsar/pulsar-3.2.0/connectors/pulsar-io-elastic-search-3.2.0.nar
```

### Pulsar Functions Python Runtime

In order to support the Python functions runtime, we have been loading the Pulsar base image with quite a few dependencies, from the `pulsar-client` Python SDK to gRPC, which is quite a heavy package with many transitive dependencies.

Given that the vast majority of users run the `pulsar` base image for brokers and not for Python functions, it would make sense to split the Python support into a different image, like `pulsar-functions-python`, which extends the base image and adds all the needed Python dependencies.

This way it will be very easy for users to select the appropriate image, and we wouldn't be shipping a large amount of unused Python dependencies to users who don't need them.

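Going back to the targeted connector images: for users who need more than one connector, the Dockerfile could be generated instead of written by hand. The following is only a rough sketch, not something that exists in the Pulsar repo. `gen_dockerfile` is a hypothetical helper, pinning the base image tag to the same version as the connectors is an assumption, and the download URL layout is simply copied from the `elastic-search` example earlier in this mail.

```shell
#!/bin/sh
# Sketch: print a Dockerfile that adds only the requested connectors
# on top of the base `pulsar` image.
# gen_dockerfile is a hypothetical helper, not part of Pulsar.
# Usage: gen_dockerfile <pulsar-version> <connector-name>...
gen_dockerfile() {
  version="$1"
  shift

  # Assumption: the base image tag matches the connector version.
  printf 'FROM apachepulsar/pulsar:%s\n' "$version"
  printf 'RUN mkdir -p connectors && cd connectors \\\n'

  # One wget per connector, using the URL layout from the example above.
  for name in "$@"; do
    printf '    && wget https://downloads.apache.org/pulsar/pulsar-%s/connectors/pulsar-io-%s-%s.nar \\\n' \
      "$version" "$name" "$version"
  done

  # Terminates the chain of line continuations.
  printf '    && true\n'
}

# Example: an image with just the Elasticsearch connector.
gen_dockerfile 3.2.0 elastic-search
```

The output can be redirected to a file and passed to `docker build`. Additional connector names go after the version as extra arguments, provided a matching `.nar` file is actually published under that name.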
What are people's opinions with respect to this?

Matteo

--
Matteo Merli
<matteo.me...@gmail.com>