To clarify, are you running Hive with Spark as the execution engine (as opposed to Hive's default execution engine, MapReduce)?
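
If it is the Spark engine you are after, it is worth confirming what the session is actually using. A minimal check from the Hive CLI (a sketch; the spark.home value is a placeholder for your own Spark installation):

    set hive.execution.engine;        -- prints the current engine (mr or spark)
    set hive.execution.engine=spark;  -- switch this session to the Spark engine
    set spark.master=yarn-client;     -- Spark driver on the client, executors on YARN
    set spark.home=/usr/local/spark;  -- placeholder: path to your Spark installation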
Mich Talebzadeh
http://talebzadehmich.wordpress.com

From: Link Qian [mailto:fastupl...@outlook.com]
Sent: 30 November 2015 13:21
To: user@hive.apache.org
Subject: Problem with getting start of Hive on Spark

Hello,

Following the Hive wiki page, https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started, I have had several failures executing HQL on the Spark engine with YARN. I have hadoop-2.6.2, yarn-2.6.2 and Spark-1.5.2. The failures occur both with the spark-1.5.2-hadoop2.6 distribution and with a spark-1.5.2-without-hive version custom-compiled following the instructions on that wiki page. The Hive CLI submits the Spark job and the job runs for a short time; the RM web UI shows it as successful, but the Hive CLI reports that the job failed. Here is a snippet of the Hive CLI debug log. Any suggestions?

15/11/30 07:31:36 [main]: INFO status.SparkJobMonitor: state = SENT
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO yarn.Client: Application report for application_1448886638370_0001 (state: RUNNING)
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO yarn.Client:
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      client token: N/A
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      diagnostics: N/A
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      ApplicationMaster host: 192.168.1.12
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      ApplicationMaster RPC port: 0
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      queue: default
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      start time: 1448886649489
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      final status: UNDEFINED
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      tracking URL: http://namenode.localdomain:8088/proxy/application_1448886638370_0001/
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl:      user: hadoop
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO cluster.YarnClientSchedulerBackend: Application application_1448886638370_0001 has started running.
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51326.
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO netty.NettyBlockTransferService: Server created on 51326
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.1.10:51326 with 66.8 MB RAM, BlockManagerId(driver, 192.168.1.10, 51326)
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO storage.BlockManagerMaster: Registered BlockManager
state = SENT
15/11/30 07:31:37 [main]: INFO status.SparkJobMonitor: state = SENT
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$MessageHeader (5 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type java.lang.Integer (2 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.RpcDispatcher: [ClientProtocol] Received RPC message: type=REPLY id=0 payload=java.lang.Integer
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO spark.SparkContext: Added JAR file:/home/hadoop/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar at http://192.168.1.10:41276/jars/hive-exec-1.2.1.jar with timestamp 1448886697575
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$MessageHeader (5 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$NullMessage (2 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.RpcDispatcher: [ClientProtocol] Received RPC message: type=REPLY id=1 payload=org.apache.hive.spark.client.rpc.Rpc$NullMessage
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO rpc.RpcDispatcher: [DriverProtocol] Closing channel due to exception in pipeline (java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job).
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type org.apache.hive.spark.client.rpc.Rpc$MessageHeader (5 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.KryoMessageCodec: Decoded message of type java.lang.String (3720 bytes)
15/11/30 07:31:37 [RPC-Handler-3]: DEBUG rpc.RpcDispatcher: [ClientProtocol] Received RPC message: type=ERROR id=2 payload=java.lang.String
15/11/30 07:31:37 [RPC-Handler-3]: WARN rpc.RpcDispatcher: Received error message:
io.netty.handler.codec.DecoderException: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:358)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:230)
    at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
    at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
    at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
    at org.apache.hive.spark.client.rpc.KryoMessageCodec.decode(KryoMessageCodec.java:96)
    at io.netty.handler.codec.ByteToMessageCodec$1.decode(ByteToMessageCodec.java:42)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:327)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: org.apache.hive.spark.client.Job
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 39 more
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 WARN client.RemoteDriver: Shutting down driver because RPC channel was closed.
15/11/30 07:31:37 [stderr-redir-1]: INFO client.SparkClientImpl: 15/11/30 07:31:37 INFO client.RemoteDriver: Shutting down remote driver.
15/11/30 07:31:37 [RPC-Handler-3]: WARN client.SparkClientImpl: Client RPC channel closed unexpectedly.

best regards,
Link Qian
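
A note on the trace above: java.lang.NoClassDefFoundError: org/apache/hive/spark/client/Job in the remote driver usually indicates a classpath mismatch between the Hive client and the Spark driver, i.e. the driver cannot see the Hive on Spark client classes that ship in hive-exec. For Hive 1.x against Spark 1.x, the Getting Started page's setup step is to link the Spark assembly jar into Hive's lib directory and to use a Spark build compiled without Hive. A minimal sketch, reusing the Hive path from the log above; the SPARK_HOME path and jar name are placeholders for your own build:

    # placeholder path -- adjust to your own Spark build (compiled without Hive)
    export SPARK_HOME=/usr/local/spark-1.5.2-bin-without-hive
    # Hive 1.x expects the Spark assembly jar on its classpath; link it into Hive's lib
    ln -s $SPARK_HOME/lib/spark-assembly-*.jar /home/hadoop/apache-hive-1.2.1-bin/lib/

Restart the Hive CLI afterwards so the linked jar is picked up, and make sure spark.home (see the set commands earlier in the thread) points at the same Spark build that YARN runs.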