[ https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=856366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-856366 ]

ASF GitHub Bot logged work on HIVE-26400:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Apr/23 09:21
            Start Date: 12/Apr/23 09:21
    Worklog Time Spent: 10m
      Work Description: deniskuzZ commented on code in PR #3448:
URL: https://github.com/apache/hive/pull/3448#discussion_r1163861025


##########
packaging/src/docker/README.md:
##########
@@ -0,0 +1,131 @@
+### Introduction
+
+---
+Run Apache Hive inside docker container in pseudo-distributed mode, provide the following
+- Quick-start/Debugging/Prepare a test env for Hive
+
+
+### Quickstart
+
+---
+#### Build image
+Apache Hive relies on Hadoop, Tez and some others to facilitate reading, writing, and managing large datasets.
+The `build.sh` provides ways to build the image against specified version of the dependent, as well as build from source.
+
+##### Build from source
+```shell
+mvn clean package -pl packaging -DskipTests -Pdocker
+```
+##### Build with specified version
+There are some arguments to specify the component version:
+```shell
+-hadoop <hadoop version>
+-tez <tez version>
+-hive <hive version>
+```
+If the version is not provided, it will read the version from current `pom.xml`:
+`project.version`, `hadoop.version` and `tez.version` for Hive, Hadoop and Tez respectively.
+
+For example, the following command uses Hive 3.1.3, Hadoop `hadoop.version` and Tez `tez.version` to build the image,
+```shell
+./build.sh -hive 3.1.3
+```
+If the command does not specify the Hive version, it will use the local `apache-hive-${project.version}-bin.tar.gz`(will trigger a build if it doesn't exist),
+together with Hadoop 3.1.0 and Tez 0.10.1 to build the image,
+```shell
+./build.sh -hadoop 3.1.0 -tez 0.10.1
+```
+After building successfully, we can get a Docker image named `apache/hive` by default, the image is tagged by the provided Hive version.
+
+#### Run services
+
+Before going further, we should define the environment variable `HIVE_VERSION` first.
+For example, if `-hive 3.1.3` is specified to build the image,
+```shell
+export HIVE_VERSION=3.1.3
+```
+or assuming that you're relying on current `project.version` from pom.xml,
+```shell
+export HIVE_VERSION=$(mvn -f pom.xml -q help:evaluate -Dexpression=project.version -DforceStdout)
+```
+- Metastore
+
+For a quick start, launch the Metastore with Derby,
+ ```shell
+ docker run -d -p 9083:9083 --env SERVICE_NAME=metastore --name metastore-standalone apache/hive:${HIVE_VERSION}
+ ```
+ Everything would be lost when the service is down. In order to save the Hive table's schema and data, start the container with an external Postgres and Volume to keep them,
+
+ ```shell
+ docker run -d -p 9083:9083 --env SERVICE_NAME=metastore \
+ --env DB_DRIVER=postgres \

Review Comment:
   👍


Issue Time Tracking
-------------------

    Worklog Id:     (was: 856366)
    Time Spent: 13h 20m  (was: 13h 10m)

> Provide docker images for Hive
> ------------------------------
>
>                 Key: HIVE-26400
>                 URL: https://issues.apache.org/jira/browse/HIVE-26400
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Build Infrastructure
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Blocker
>              Labels: hive-4.0.0-must, pull-request-available
>          Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Make Apache Hive be able to run inside docker container in pseudo-distributed
> mode, with MySQL/Derby as its back database, provide the following:
> * Quick-start/Debugging/Prepare a test env for Hive;
> * Tools to build target image with specified version of Hive and its
> dependencies;
> * Images can be used as the basis for the Kubernetes operator.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)