[ https://issues.apache.org/jira/browse/HIVE-26400?focusedWorklogId=795489&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-795489 ]
ASF GitHub Bot logged work on HIVE-26400:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Jul/22 01:54
            Start Date: 27/Jul/22 01:54
    Worklog Time Spent: 10m
      Work Description: achennagiri commented on code in PR #3448:
URL: https://github.com/apache/hive/pull/3448#discussion_r930548578


##########
dev-support/docker/README.md:
##########
@@ -0,0 +1,125 @@
+### Introduction
+
+---
+Run Apache Hive inside docker container in pseudo-distributed mode, with MySQL as its back database.
+Provide the following
+- Quick-start/Debugging/Prepare a test env for Hive
+- Images can be used as the basis for the Kubernetes operator
+
+### Overview
+
+---
+#### Files
+- docker-compose.yml: Docker compose file
+- Dockerfile-*, scripts/docker-entrypoint.sh: Instructions to build images.
+- conf/hiveserver2-site.xml: Configuration for HiveServer2
+- conf/metastore-site.xml: Configuration for Hive Metastore
+- deploy.sh Entrance to build images and run them.
+
+### Quickstart
+
+---
+#### Build images
+Hive relies on Hadoop, Tez and MySQL to work correctly. Up to now, there are so many versions that these dependents have been released, including Hive itself.
+Providing a way to build Hive against a specified version of the dependent sounds reasonable. There are some build args for this purpose, as listed below:
+```shell
+--hadoop <hadoop version>
+--tez <tez version>
+--hive <hive version>
+```
+If the versions are not given during build, then it will read the version info from project top `pom.xml`: project.version, hadoop.version, tez.version,
+these are assigned to hive version, hadoop version, tez version accordingly. There are two different ways to build the image, the key difference is whether we
+have specified the Hive version or not.
+- Build remotely
+
+The Hive version is picked up by `--hive <hive version>`, for example:
+```shell
+sh deploy.sh --hive 3.1.3
+```
+This command will pull the Hive tar ball from Apache to local, together with Hadoop and Tez, while those two versions are defined in top `pom.xml`
+to build the image.
+
+- Build locally
+
+If the Hive version is not specified, then it will search the file: `packaging/target/apache-hive-${project.version}-bin.tar.gz` to make sure it exists, otherwise it will
+stop building.
+```shell
+sh deploy.sh --hadoop 3.1.0 --tez 0.10.1
+```
+The above example will use the local `apache-hive-${project.version}-bin.tar.gz`, Hadoop 3.1.0 and Tez 0.10.1 to build the target image.
+
+#### Run services
+
+- Launch a single standalone Metastore
+
+If you want to just test Metastore or play around with it, execute the following:
+```shell
+sh deploy.sh --metastore
+```
+or run with docker if Metastore image is already here.
+```shell
+docker run --name metastore-standalone hive:metastore-$HIVE_VERSION
+```
+
+- Launch a single standalone HiveServer2 for a quick start
+
+The HiveServer2 will be started with an embedded Metastore. To launch it, execute the following:
+```shell
+sh deploy.sh --hiveserver2
+```
+Or if the image for HiveServer2 has been built successfully, simply running with:
+```shell
+docker run --name hiveserver2-standalone hive:hiveserver2-$HIVE_VERSION
+```
+Please pay attention to that the data of the HiveServer2 would be lost between container restarts.
+In order to save the data, try to bring up the Hive with the following way.
+
+- Launch a cluster with HiveServer2, Metastore and MySQL as its back database.
+
+To save data between container restarts, we use Docker's volume to persist data to the local disk. Just by executing:

Review Comment:
   Thank you! This work is really cool and helpful. Thank you for working on this!

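
For readers following the quoted README, the version fallback it describes (explicit `--hive`/`--hadoop`/`--tez` arguments first, then the top-level `pom.xml`, plus the check for a local tarball when `--hive` is absent) might be sketched roughly as below. This is not the `deploy.sh` from the PR: the function names `pom_value`, `pom_project_version` and `resolve_versions`, the `BUILD_MODE` variable, and the `sed`-based pom parsing are all assumptions made for illustration, and the parsing assumes one element per line with `<parent>` and `</parent>` on their own lines.

```shell
#!/bin/sh
# Hypothetical sketch of the version fallback described in the quoted README.
# NOT the actual deploy.sh; every function and variable name here is made up.

# Print the text of the first <name>...</name> element in a pom file.
# $1 = pom path, $2 = element name (for example hadoop.version).
pom_value() {
  sed -n "s:.*<$2>\(.*\)</$2>.*:\1:p" "$1" | head -n 1
}

# Print the project's own <version>, skipping the <parent> block.
# Assumes <parent> and </parent> each sit on their own line.
pom_project_version() {
  sed '/<parent>/,/<\/parent>/d' "$1" |
    sed -n 's:.*<version>\(.*\)</version>.*:\1:p' | head -n 1
}

# resolve_versions <source root> [--hive V] [--hadoop V] [--tez V]
# Sets HIVE_VERSION, HADOOP_VERSION, TEZ_VERSION and BUILD_MODE (remote|local).
# Returns 1 if a local build is requested but the tarball is missing.
resolve_versions() {
  src_root="$1"; shift
  pom="$src_root/pom.xml"
  HIVE_VERSION=""; HADOOP_VERSION=""; TEZ_VERSION=""; BUILD_MODE=""
  while [ $# -gt 0 ]; do
    [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
    case "$1" in
      --hive)   HIVE_VERSION="$2";   shift 2 ;;
      --hadoop) HADOOP_VERSION="$2"; shift 2 ;;
      --tez)    TEZ_VERSION="$2";    shift 2 ;;
      *) echo "unknown option: $1" >&2; return 2 ;;
    esac
  done
  # Anything not given on the command line falls back to the top-level pom.
  [ -n "$HADOOP_VERSION" ] || HADOOP_VERSION=$(pom_value "$pom" hadoop.version)
  [ -n "$TEZ_VERSION" ] || TEZ_VERSION=$(pom_value "$pom" tez.version)
  if [ -n "$HIVE_VERSION" ]; then
    # An explicit Hive version means the release tarball would be downloaded.
    BUILD_MODE=remote
  else
    # No Hive version: use project.version and require the locally built tarball.
    BUILD_MODE=local
    HIVE_VERSION=$(pom_project_version "$pom")
    tarball="$src_root/packaging/target/apache-hive-${HIVE_VERSION}-bin.tar.gz"
    if [ ! -f "$tarball" ]; then
      echo "missing $tarball; build Hive first or pass --hive" >&2
      return 1
    fi
  fi
}
```

Under these assumptions, `resolve_versions . --hive 3.1.3` would select a remote build and take the Hadoop and Tez versions from `./pom.xml`, while `resolve_versions . --hadoop 3.1.0 --tez 0.10.1` would select a local build and fail unless the packaged tarball already exists.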
Issue Time Tracking
-------------------

    Worklog Id:     (was: 795489)
    Time Spent: 2h 40m  (was: 2.5h)

> Provide a self-contained docker
> -------------------------------
>
>                 Key: HIVE-26400
>                 URL: https://issues.apache.org/jira/browse/HIVE-26400
>             Project: Hive
>          Issue Type: Improvement
>          Components: Build Infrastructure
>            Reporter: Zhihua Deng
>            Assignee: Zhihua Deng
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)