Hello Daniel Vanko, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/22583

to look at the new patch set (#5).

Change subject: IMPALA-13825: Extend Docker container build to custom base images
......................................................................

IMPALA-13825: Extend Docker container build to custom base images

Downstream system vendors, users and customers have recently expressed
interest in consuming Impala in containerized form, taking advantage of
specialized, hardened container base images, such as those based on the
Wolfi project by Chainguard;
see: https://github.com/wolfi-dev.

This patch enables Impala container images to be built on top of custom
base images, and adds an implementation example that uses the publicly
available Wolfi base image.

Building a customized Docker image follows a hybrid approach. Instead of
replicating the complete Impala build process inside a Wolfi container
for a fully native binary build, it relies on an existing build platform
that is compatible with the binary packages available inside the custom
container image. For Wolfi the Impala binaries are supplied by the
Red Hat 9 build of Impala. This is made possible by the fact that major
library dependencies of Impala have the same versions on Wolfi OS and
Red Hat 9, so binaries built on Red Hat 9 can be run on Wolfi
with no changes.

The binaries produced by the regular build process are then installed
into a Docker image built on top of an explicitly specified custom base
image. The selection of a custom base image is controlled by two
environment variables:
- USE_CUSTOM_IMPALA_BASE_IMAGE (boolean):
  If set to 'true', triggers the use of the custom image.
  When set to 'false' or left unspecified, the Docker base image is
  selected by the existing logic of matching the build platform's
  operating system.
- IMPALA_CUSTOM_DOCKER_BASE (string):
  Specifies the URI of the custom base image.

The default values of these variables can be overridden from the
environment, from impala-config-branch.sh, or from impala-config-local.sh.
They are reported at the end of bin/impala-config.sh where important
environment variables are listed. They are also added to the list of
variables in bin/jenkins/dockerized-impala-preserve-vars.py to ensure
that they can be used in the context of Jenkins jobs as well.
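
The selection rule described above can be sketched as follows. This is
only an illustration, not the code in the patch: the function name
select_docker_base_image, the platform_default argument, and the error
raised for an empty IMPALA_CUSTOM_DOCKER_BASE are assumptions made for
the example.

```python
def select_docker_base_image(env, platform_default):
    """Pick the Docker base image from the two variables described above.

    env: a mapping of environment variables (for example os.environ).
    platform_default: the image chosen by the existing logic that matches
        the build platform's operating system (hypothetical parameter).
    """
    flag = env.get("USE_CUSTOM_IMPALA_BASE_IMAGE", "false")
    if flag.strip().lower() != "true":
        # 'false' or unset: keep the existing platform-matching behavior.
        return platform_default

    custom = env.get("IMPALA_CUSTOM_DOCKER_BASE", "").strip()
    if not custom:
        # Assumption: treat a missing URI as a configuration error.
        raise ValueError(
            "USE_CUSTOM_IMPALA_BASE_IMAGE is true, "
            "but IMPALA_CUSTOM_DOCKER_BASE is not set")
    return custom
```

For example, with USE_CUSTOM_IMPALA_BASE_IMAGE=true and
IMPALA_CUSTOM_DOCKER_BASE pointing at a Wolfi base image, the function
returns that URI; with the flag unset it returns platform_default.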

The unified script that installs Impala's required dependencies into the
container image is extended for Wolfi to handle APK packages.

A new script is added to install Bash in the Docker image if it is
missing. Impala build scripts (including the scripts used during Docker
image builds) as well as container startup scripts require Bash,
but minimal container base images usually omit it, favoring a smaller
alternative.

To improve the debugging experience for a containerized Impala
minicluster, the minicluster starter script bin/start-impala-cluster.py
is extended with the following features:
- synchronizes every launched container's timezone with the host's.
  This is done primarily by injecting the TZ environment variable into
  the container with the name of the timezone used on the host. This is
  taken either from the host's TZ variable (if set), or from the host's
  /etc/localtime symlink, checking the name of the timezone file it
  points to. In case /etc/localtime is not a symlink (and TZ is not set
  on the host), the host's /etc/localtime file is mounted into the
  container.
- sets up a directory for each container to collect the Java VM's error
  files (hs_err_pidNNNN.log) from the containers.
- adds the --mount_sources command line parameter, which mounts the
  complete $IMPALA_HOME subtree into the container at
  /opt/impala/sources to make source code available inside the container
  for easier debugging.
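
The timezone fallback order from the first item above can be sketched
like this. It is a minimal illustration under stated assumptions, not
the code added to bin/start-impala-cluster.py: the function name
timezone_docker_args, the "zoneinfo/" path marker, the read-only mount,
and the empty result when the file is missing are all choices made for
the example.

```python
import os

ZONEINFO_MARKER = "zoneinfo/"  # assumed layout, e.g. /usr/share/zoneinfo/


def timezone_docker_args(env, localtime_path="/etc/localtime"):
    """Return extra 'docker run' arguments that pass the host timezone."""
    # 1. Prefer the host's TZ variable if it is set.
    tz = env.get("TZ")
    if tz:
        return ["-e", "TZ=" + tz]

    # 2. Otherwise derive the zone name from the /etc/localtime symlink.
    if os.path.islink(localtime_path):
        target = os.path.realpath(localtime_path)
        idx = target.find(ZONEINFO_MARKER)
        if idx != -1:
            return ["-e", "TZ=" + target[idx + len(ZONEINFO_MARKER):]]

    # 3. Last resort: mount the host's file into the container.
    if os.path.exists(localtime_path):
        return ["-v", localtime_path + ":/etc/localtime:ro"]

    # Assumption: with no timezone information at all, add nothing.
    return []
```

The returned list is meant to be appended to the argument list of the
docker run command that starts each container.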

Tested by running core-mode tests in the following environments:
- Regular run (impalad running natively on the platform) on Ubuntu 20.04
- Regular run on Rocky Linux 9.2
- Dockerised run (impalad instances running in their individual
  containers) using Ubuntu 20.04 containers
- Dockerised run (impalad instances running in their individual
  containers) using Rocky Linux 9.2 containers
- Dockerised run (impalad instances running in their individual
  containers) using Wolfi's wolfi-base containers

Change-Id: Ia5e39f399664fe66f3774caa316ed5d4df24befc
---
M bin/impala-config.sh
M bin/jenkins/dockerized-impala-bootstrap-and-test.sh
M bin/start-impala-cluster.py
M docker/CMakeLists.txt
M docker/daemon_entrypoint.sh
M docker/docker-build.sh
M docker/impala_base/Dockerfile
M docker/impala_profile_tool/Dockerfile
A docker/install_bash_if_needed.sh
M docker/install_os_packages.sh
M docker/setup_build_context.py
M tests/common/impala_connection.py
12 files changed, 297 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/22583/5
--
To view, visit http://gerrit.cloudera.org:8080/22583
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5e39f399664fe66f3774caa316ed5d4df24befc
Gerrit-Change-Number: 22583
Gerrit-PatchSet: 5
Gerrit-Owner: Laszlo Gaal <laszlo.g...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Vanko <dva...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <laszlo.g...@cloudera.com>
