Nikita Ilyushkin created HIVE-22509:
---------------------------------------
             Summary: LLAP with YARN services: "Error: Could not find or load main class org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon"
                 Key: HIVE-22509
                 URL: https://issues.apache.org/jira/browse/HIVE-22509
             Project: Hive
          Issue Type: Bug
          Components: llap
    Affects Versions: 3.1.1
         Environment: We use self-built packages of Hadoop (3.1.2) and Hive (3.1.1) based on Bigtop. Hosts in use:
{code:java}
# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
{code}
            Reporter: Nikita Ilyushkin


We have a pretty basic installation of Hadoop, Hive and ZooKeeper and need to use LLAP with YARN services, because as far as I can judge Slider is dead and YARN services is the generic mechanism for jobs such as LLAP.

In accordance with [https://hadoop.apache.org/docs/r3.1.2/hadoop-yarn/hadoop-yarn-site/yarn-service/Overview.html] I added
{code:java}
hadoop.registry.zk.quorum: <ACTUAL QUORUM>
{code}
to core-site.xml and
{code:java}
yarn.webapp.api-service.enable: True
{code}
to yarn-site.xml. This enabled me to run the simple example from that page.

Next, as the hive user, I issued (without --auxhbase=false this fails):
{code:java}
/usr/lib/hive/bin/hive --service llap --name llaptest --instances 2 --size 2g --auxhbase=false
{code}
which gave me:
{code:java}
-bash-4.2$ ls -l llap-yarn-18Nov2019/
total 116932
-rw-rw-r--. 1 hive hive 119725946 Nov 18 10:00 llap-18Nov2019.tar.gz
-rwx------. 1 hive hive       249 Nov 18 10:00 run.sh
drwx------. 5 hive hive        88 Nov 18 10:00 test
-rw-rw-r--. 1 hive hive      1777 Nov 18 11:51 Yarnfile
{code}
and run.sh started the YARN service.

The problem is that the AM for LLAP starts, but the application's containers fail perpetually. I can see this in the logs and in the RM UI: 1-2 containers spawn per second. The logs showed this (hostname is replaced):
{code:java}
cat /var/log/hadoop-yarn/userlogs/application_1574064939102_0006/container_1574064939102_0006_01_000002/llap-daemon-hive-hostname.out
...
+ exec /usr/lib/jvm/jre-openjdk/bin/java -Dproc_llapdaemon -Xms4096m -Xmx4096m -Dhttp.maxConnections=5 -server -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+PrintGCDetails -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=100M -XX:+PrintGCDateStamps -Xloggc:/var/log/hadoop-yarn/userlogs/application_1574064939102_0006/container_1574064939102_0006_01_000002/gc_2019-11-18-15.log -XX:+UseParallelGC -Djava.io.tmpdir=/srv/hadoop-yarn/nm-local/usercache/hive/appcache/application_1574064939102_0006/container_1574064939102_0006_01_000002/tmp/ -Dlog4j.configurationFile=llap-daemon-log4j2.properties -Dllap.daemon.log.dir=/var/log/hadoop-yarn/userlogs/application_1574064939102_0006/container_1574064939102_0006_01_000002 -Dllap.daemon.log.file=llap-daemon-hive-hostname.log -Dllap.daemon.root.logger=query-routing -Dllap.daemon.log.level=INFO -classpath '/srv/hadoop-yarn/nm-local/usercache/hive/appcache/application_1574064939102_0006/container_1574064939102_0006_01_000002/lib/conf/:/srv/hadoop-yarn/nm-local/usercache/hive/appcache/application_1574064939102_0006/container_1574064939102_0006_01_000002/lib//lib/*:/srv/hadoop-yarn/nm-local/usercache/hive/appcache/application_1574064939102_0006/container_1574064939102_0006_01_000002/lib//lib/tez/*:/srv/hadoop-yarn/nm-local/usercache/hive/appcache/application_1574064939102_0006/container_1574064939102_0006_01_000002/lib//lib/udfs/*:.:/srv/hadoop-yarn/nm-local/usercache/hive/appcache/application_1574064939102_0006/container_1574064939102_0006_01_000002/lib/lib/*.jar' org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon
Error: Could not find or load main class org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon
...
{code}
Analyzing this led me to the conclusion that /srv/hadoop-yarn/nm-local/usercache/hive/appcache/application_1574064939102_0006/container_1574064939102_0006_01_000002/lib//lib/* actually contains /usr/lib/hive/lib/hive-llap-server-3.1.1.jar with the org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon class.

Then I tried this:
{code:java}
# java -Dproc_llapdaemon -classpath '/usr/lib/hive/lib/*:/etc/hive/conf/*' 'org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon'
Error: Could not find or load main class org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon
# java -Dproc_llapdaemon -classpath /usr/lib/hive/lib/hive-llap-server-3.1.1.jar 'org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon'
Error: Could not find or load main class org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon
{code}
This puzzled me even more.

Please help me start LLAP with YARN services correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
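
A possible way to narrow down the two manual runs above, offered as a sketch and not a confirmed diagnosis: as far as I know, the Java 8 launcher prints "Could not find or load main class" both when the class is missing and when the class is present but something it depends on cannot be loaded. Checking whether the class file really is inside a jar separates the two cases. The helper names `class_to_entry` and `find_class_in_jars` are hypothetical, and the second one assumes `unzip` is installed on the host.

```shell
# Hypothetical helper: convert a fully qualified class name
# into the path of its .class entry inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: print every jar among the arguments that
# contains the .class entry for the given class name.
# Assumes `unzip` is available; paths that are not regular files are skipped.
find_class_in_jars() {
  entry=$(class_to_entry "$1")
  shift
  for jar in "$@"; do
    [ -f "$jar" ] || continue
    if unzip -l "$jar" 2>/dev/null | grep -qF -- "$entry"; then
      printf '%s\n' "$jar"
    fi
  done
}
```

For the case in this report, the call would be `find_class_in_jars org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon /usr/lib/hive/lib/*.jar`. If a jar is listed, the class file itself is present, and a next step could be to repeat the manual run with the Hadoop jars added to the classpath, for example `-classpath "/usr/lib/hive/lib/*:$(hadoop classpath)"`. Whether that changes the error in this setup is untested.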