Hi,

I've installed Spark 1.0.0 on HDP 2.1 (the Hortonworks sandbox). I moved the hive-site.xml file into Spark's conf directory in an attempt to connect Spark to my existing Hive. The shell reports that the assembly was built with Hive, so that part should be good. Is the Hive version that ships with HDP 2.1 not supported, or are there any other hints as to what I'm doing wrong?
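For reference, here is a minimal session sketching what I'm ultimately trying to do once the HiveContext comes up. The hql() call is the API from the Spark 1.0 SQL programming guide; sample_07 is just an example table name from the sandbox, adjust to your own setup:

    // In spark-shell, with hive-site.xml on Spark's conf path.
    // sc is the SparkContext the shell creates automatically.
    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import hiveContext._  // brings hql() into scope

    // List the tables the Hive metastore knows about, then run a
    // simple query against one of them (sample_07 is a sandbox table).
    hql("SHOW TABLES").collect().foreach(println)
    hql("SELECT COUNT(*) FROM sample_07").collect().foreach(println)

Right now I never get past the first line; the HiveContext constructor itself throws.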
Below is the full log from starting spark-shell until the error:

[root@sandbox bin]# ./spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/06/03 10:30:21 INFO SecurityManager: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/06/03 10:30:21 INFO SecurityManager: Changing view acls to: root
14/06/03 10:30:21 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/06/03 10:30:21 INFO HttpServer: Starting HTTP Server
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.0.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
14/06/03 10:30:28 INFO SecurityManager: Changing view acls to: root
14/06/03 10:30:28 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/06/03 10:30:29 INFO Slf4jLogger: Slf4jLogger started
14/06/03 10:30:29 INFO Remoting: Starting remoting
14/06/03 10:30:29 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sp...@sandbox.hortonworks.com:58176]
14/06/03 10:30:29 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sp...@sandbox.hortonworks.com:58176]
14/06/03 10:30:29 INFO SparkEnv: Registering MapOutputTracker
14/06/03 10:30:29 INFO SparkEnv: Registering BlockManagerMaster
14/06/03 10:30:29 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140603103029-75e9
14/06/03 10:30:29 INFO MemoryStore: MemoryStore started with capacity 294.9 MB.
14/06/03 10:30:29 INFO ConnectionManager: Bound socket to port 33645 with id = ConnectionManagerId(sandbox.hortonworks.com,33645)
14/06/03 10:30:29 INFO BlockManagerMaster: Trying to register BlockManager
14/06/03 10:30:29 INFO BlockManagerInfo: Registering block manager sandbox.hortonworks.com:33645 with 294.9 MB RAM
14/06/03 10:30:29 INFO BlockManagerMaster: Registered BlockManager
14/06/03 10:30:29 INFO HttpServer: Starting HTTP Server
14/06/03 10:30:29 INFO HttpBroadcast: Broadcast server started at http://10.0.2.15:43286
14/06/03 10:30:29 INFO HttpFileServer: HTTP File server directory is /tmp/spark-fc0d099f-16ca-48e3-8c49-6f809bf67449
14/06/03 10:30:29 INFO HttpServer: Starting HTTP Server
14/06/03 10:30:30 INFO SparkUI: Started SparkUI at http://sandbox.hortonworks.com:4040
14/06/03 10:30:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/06/03 10:30:31 INFO Executor: Using REPL class URI: http://10.0.2.15:34588
14/06/03 10:30:31 INFO SparkILoop: Created spark context..
Spark context available as sc.

scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
14/06/03 10:30:53 INFO deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/06/03 10:30:53 INFO deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/06/03 10:30:53 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/06/03 10:30:53 INFO deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/06/03 10:30:53 INFO deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/06/03 10:30:53 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/06/03 10:30:53 INFO deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:286)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:166)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:12)
    at $iwC$$iwC$$iwC.<init>(<console>:17)
    at $iwC$$iwC.<init>(<console>:19)
    at $iwC.<init>(<console>:21)
    at <init>(<console>:23)
    at .<init>(<console>:27)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:601)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:608)
    at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:611)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:936)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
    at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthenticator(HiveUtils.java:368)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:278)
    ... 41 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
    at scala.tools.nsc.interpreter.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:83)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthenticator(HiveUtils.java:361)
    ... 42 more

--
Lars Selsaas
Data Engineer
Think Big Analytics <http://thinkbiganalytics.com>
lars.sels...@thinkbiganalytics.com
650-537-5321