[ 
https://issues.apache.org/jira/browse/HIVE-28424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870357#comment-17870357
 ] 

Qiheng He commented on HIVE-28424:
----------------------------------

- [~dengzh] Thanks for your answer! I verified in 
https://github.com/linghengqian/hivesever2-v400-sd-test/pull/1 that by changing 
*hive.server2.thrift.port*, I can change the port mapping of the docker 
container, so that applications outside the docker network can obtain the 
correct connection information of the jdbcUrl of hiveserver2 through the 
zookeeper server.
- Setting *hive.server2.thrift.bind.host* to *127.0.0.1* does not help much, 
because the default *network_mode* of the docker container is not *host*. 
In that case, accessing *localhost:10000* from outside the docker network 
requires changing the *network_mode* of the docker container to *host*. 
However, directly setting the hostname of the docker container to *127.0.0.1* 
allows normal access from outside the docker network, which makes sense to me.
- It would be nice if the README of https://hub.docker.com/r/apache/hive 
mentioned that users can change the default port *10000* in the container by 
modifying *hive.server2.thrift.port*, which would help with the integration 
of external applications. The README on Docker Hub could also add a note such 
as `If wanting to access HiveServer2 running locally on host machine 
(not in Docker container), Linux users may use --net host instead of exposing 
the port`. For Windows and Mac users, see 
https://stackoverflow.com/questions/31324981, but I don't have an environment 
other than Linux to verify it.
- What do you think? If it is difficult to modify the README on Docker Hub, 
can the current issue simply be closed?
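
For reference, the port override from the first point can be sketched as the 
following *hive-site.xml* fragment. The value *10001* is only an assumed 
example, not necessarily the one used in the linked pull request, and the 
*ports* mapping in the docker-compose file would presumably need to expose the 
same port (for example "10001:10001").
{code:xml}
<!-- Sketch only: 10001 is an assumed example value, not taken from the PR. -->
<property>
    <name>hive.server2.thrift.port</name>
    <value>10001</value>
</property>
{code}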

> The Docker Image of HiveServer2 should provide an env variable defining the 
> `hostname:port` passed into the znode
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-28424
>                 URL: https://issues.apache.org/jira/browse/HIVE-28424
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Qiheng He
>            Priority: Major
>
> * The Docker Image of HiveServer2 should provide an environment variable 
> defining the *hostname:port* passed into the znode.
>  * This requirement may seem a bit strange at first glance, but it comes 
> from a small service orchestration scenario related to 
> [https://github.com/dbeaver/dbeaver/issues/22777] .
>  * For the Docker Image of HiveServer2 on {*}apache/hive:4.0.0{*}, if I need 
> to enable Zookeeper Service Discovery, I apparently need to overwrite the 
> *hive-site.xml* in the Docker Image of {*}apache/hive:4.0.0{*}. I tested what 
> needs to be done to achieve this at 
> [https://github.com/linghengqian/hivesever2-v400-sd-test] . First I need to 
> define a docker-compose file to pull in the zookeeper server.
> {code:yaml}
> services:
>   zookeeper-server:
>     image: zookeeper:3.9.2-jre-17
>     restart: always
>     ports:
>       - "2181:2181"
>   hive-server2:
>     image: apache/hive:4.0.0
>     restart: always
>     hostname: '127.0.0.1'
>     depends_on:
>       zookeeper-server:
>         condition: service_started
>     environment:
>       SERVICE_NAME: hiveserver2
>       HIVE_CUSTOM_CONF_DIR: /hive_custom_conf
>     ports:
>       - "10000:10000"
>       - "10002:10002"
>     volumes:
>       - ./hive-custom-conf:/hive_custom_conf 
> {code}
>  - Setting the hostname of the hive-server2 docker container to *127.0.0.1* 
> is already a workaround that compromises the local docker network. It is 
> needed because (HiveServer2 hostname + :10000) is always passed to the znode 
> in the zookeeper server, and this cannot be changed externally. Typically, 
> the znode at 
> */hiveserver2/serverUri=localhost:10000;version=4.0.0;sequence=0000000000* 
> has the content 
> *hive.server2.instance.uri=localhost:10000;hive.server2.authentication=NONE;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=localhost;hive.server2.thrift.port=10000;hive.server2.use.SSL=false*
>  . This can be observed in a ZooKeeper web UI, for example by deploying the 
> Docker image *elkozmon/zoonavigator:1.1.3* .
> - At this point, I also need to mount a *hive-site.xml* into the HiveServer2 
> container. Most of its content repeats 
> https://github.com/apache/hive/blob/rel/release-4.0.0/packaging/src/docker/conf/hive-site.xml,
>  but since only a single *hive-site.xml* seems to be supported, I can only 
> repeat the definitions.
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <configuration>
>     <property>
>         <name>hive.server2.enable.doAs</name>
>         <value>false</value>
>     </property>
>     <property>
>         <name>hive.tez.exec.inplace.progress</name>
>         <value>false</value>
>     </property>
>     <property>
>         <name>hive.tez.exec.print.summary</name>
>         <value>true</value>
>     </property>
>     <property>
>         <name>hive.exec.scratchdir</name>
>         <value>/opt/hive/scratch_dir</value>
>     </property>
>     <property>
>         <name>hive.user.install.directory</name>
>         <value>/opt/hive/install_dir</value>
>     </property>
>     <property>
>         <name>tez.runtime.optimize.local.fetch</name>
>         <value>true</value>
>     </property>
>     <property>
>         <name>hive.exec.submit.local.task.via.child</name>
>         <value>false</value>
>     </property>
>     <property>
>         <name>mapreduce.framework.name</name>
>         <value>local</value>
>     </property>
>     <property>
>         <name>tez.local.mode</name>
>         <value>true</value>
>     </property>
>     <property>
>         <name>hive.metastore.warehouse.dir</name>
>         <value>/opt/hive/data/warehouse</value>
>     </property>
>     <property>
>         <name>metastore.metastore.event.db.notification.api.auth</name>
>         <value>false</value>
>     </property>
>     <property>
>         <name>hive.server2.support.dynamic.service.discovery</name>
>         <value>true</value>
>     </property>
>     <property>
>         <name>hive.zookeeper.quorum</name>
>         <value>zookeeper-server:2181</value>
>     </property>
> </configuration>
> {code}
> - At this point, outside of Docker Compose's Network, I can connect to the 
> deployed HiveServer2 in dbeaver via the jdbcUrl of 
> *jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;*.
>  
> - Now suppose the docker compose file is defined as follows, where I only 
> changed the hostnames of both containers in the same docker network.
> {code:yaml}
> services:
>   zookeeper-server:
>     image: zookeeper:3.9.2-jre-17
>     hostname: 'zookeeper-server'
>     restart: always
>     ports:
>       - "2181:2181"
>   hive-server2:
>     image: apache/hive:4.0.0
>     restart: always
>     hostname: 'server2.hive.com'
>     depends_on:
>       zookeeper-server:
>         condition: service_started
>     environment:
>       SERVICE_NAME: hiveserver2
>       HIVE_CUSTOM_CONF_DIR: /hive_custom_conf
>     ports:
>       - "10000:10000"
>       - "10002:10002"
>     volumes:
>       - ./hive-custom-conf:/hive_custom_conf 
> {code}
> - In this case, the jdbcUrl 
> *jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;*
>  still connects to zookeeper, but not to HiveServer2, because at this point 
> the zookeeper server only contains the znode 
> */hiveserver2/serverUri=server2.hive.com:10000;version=4.0.0;sequence=0000000000*,
>  and its content is 
> *hive.server2.instance.uri=server2.hive.com:10000;hive.server2.authentication=NONE;hive.server2.transport.mode=binary;hive.server2.thrift.sasl.qop=auth;hive.server2.thrift.bind.host=server2.hive.com;hive.server2.thrift.port=10000;hive.server2.use.SSL=false*.
>  *server2.hive.com:10000* is not accessible outside the docker network, 
> which hurts the local debugging experience.
> {code:java}
> com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to 
> initialize pool: Could not open client transport for any of the Server URI's 
> in ZooKeeper: Socket is closed by peer.
>       at 
> com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:596)
>       at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:582)
>       at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115)
>       at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
>       at com.lingh.HiveTest.test(HiveTest.java:20)
>       at java.base/java.lang.reflect.Method.invoke(Method.java:580)
>       at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
>       at java.base/java.util.ArrayList.forEach(ArrayList.java:1597)
> Caused by: java.sql.SQLException: Could not open client transport for any of 
> the Server URI's in ZooKeeper: Socket is closed by peer.
>       at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:420)
>       at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:285)
>       at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:94)
>       at 
> com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:121)
>       at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:364)
>       at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
>       at 
> com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:476)
>       at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:561)
>       ... 6 more
> Caused by: org.apache.hive.org.apache.thrift.transport.TTransportException: 
> Socket is closed by peer.
>       at 
> org.apache.hive.org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:184)
>       at 
> org.apache.hive.org.apache.thrift.transport.TTransport.readAll(TTransport.java:109)
>       at 
> org.apache.hive.org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:151)
>       at 
> org.apache.hive.org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:272)
>       at 
> org.apache.hive.org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
>       at 
> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:512)
>       at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:382)
>       ... 13 more
> {code}
> - I cannot find any way in the documentation to change the hiveserver2 
> hostname and port that the HiveServer2 Docker Image passes into the 
> zookeeper node. It would be nice if there were an easier way to change 
> them, such as an environment variable on the docker image.
> - I have set up a small unit test at 
> https://github.com/linghengqian/hivesever2-v400-sd-test for testing, and the 
> instructions for running are in the README.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
