Hi,

I've installed Spark 1.0.0 on HDP 2.1 (the Hortonworks sandbox). I moved the hive-site.xml file into Spark's conf directory in an attempt to connect Spark to my existing Hive. The shell reports that the assembly was built with Hive, so that part should be good. Is the Hive version that ships with HDP 2.1 not supported, or are there any other hints as to what I'm doing wrong?
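For reference, here is a minimal session sketching what I'm ultimately trying to do once the HiveContext comes up. The hql() call is the API from the Spark 1.0 SQL programming guide; sample_07 is just an example table name from the sandbox, adjust to your own setup:

    // In spark-shell, with hive-site.xml on Spark's conf path.
    // sc is the SparkContext the shell creates automatically.
    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import hiveContext._  // brings hql() into scope

    // List the tables the Hive metastore knows about, then run a
    // simple query against one of them (sample_07 is a sandbox table).
    hql("SHOW TABLES").collect().foreach(println)
    hql("SELECT COUNT(*) FROM sample_07").collect().foreach(println)

Right now I never get past the first line; the HiveContext constructor itself throws.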
Below is the full log from starting spark-shell until the error:

[root@sandbox bin]# ./spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
14/06/03 10:30:21 INFO SecurityManager: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/06/03 10:30:21 INFO SecurityManager: Changing view acls to: root
14/06/03 10:30:21 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/06/03 10:30:21 INFO HttpServer: Starting HTTP Server
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.0.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
14/06/03 10:30:28 INFO SecurityManager: Changing view acls to: root
14/06/03 10:30:28 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root)
14/06/03 10:30:29 INFO Slf4jLogger: Slf4jLogger started
14/06/03 10:30:29 INFO Remoting: Starting remoting
14/06/03 10:30:29 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sp...@sandbox.hortonworks.com:58176]
14/06/03 10:30:29 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sp...@sandbox.hortonworks.com:58176]
14/06/03 10:30:29 INFO SparkEnv: Registering MapOutputTracker
14/06/03 10:30:29 INFO SparkEnv: Registering BlockManagerMaster
14/06/03 10:30:29 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140603103029-75e9
14/06/03 10:30:29 INFO MemoryStore: MemoryStore started with capacity 294.9 MB.
14/06/03 10:30:29 INFO ConnectionManager: Bound socket to port 33645 with id = ConnectionManagerId(sandbox.hortonworks.com,33645)
14/06/03 10:30:29 INFO BlockManagerMaster: Trying to register BlockManager
14/06/03 10:30:29 INFO BlockManagerInfo: Registering block manager sandbox.hortonworks.com:33645 with 294.9 MB RAM
14/06/03 10:30:29 INFO BlockManagerMaster: Registered BlockManager
14/06/03 10:30:29 INFO HttpServer: Starting HTTP Server
14/06/03 10:30:29 INFO HttpBroadcast: Broadcast server started at http://10.0.2.15:43286
14/06/03 10:30:29 INFO HttpFileServer: HTTP File server directory is /tmp/spark-fc0d099f-16ca-48e3-8c49-6f809bf67449
14/06/03 10:30:29 INFO HttpServer: Starting HTTP Server
14/06/03 10:30:30 INFO SparkUI: Started SparkUI at http://sandbox.hortonworks.com:4040
14/06/03 10:30:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/06/03 10:30:31 INFO Executor: Using REPL class URI: http://10.0.2.15:34588
14/06/03 10:30:31 INFO SparkILoop: Created spark context..
Spark context available as sc.

scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
14/06/03 10:30:53 INFO deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
14/06/03 10:30:53 INFO deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
14/06/03 10:30:53 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
14/06/03 10:30:53 INFO deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
14/06/03 10:30:53 INFO deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
14/06/03 10:30:53 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
14/06/03 10:30:53 INFO deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:286)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:166)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:12)
    at $iwC$$iwC$$iwC.<init>(<console>:17)
    at $iwC$$iwC.<init>(<console>:19)
    at $iwC.<init>(<console>:21)
    at <init>(<console>:23)
    at .<init>(<console>:27)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:601)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:608)
    at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:611)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:936)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
    at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthenticator(HiveUtils.java:368)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:278)
    ... 41 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator
    at scala.tools.nsc.interpreter.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:83)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthenticator(HiveUtils.java:361)
    ... 42 more

--
Lars Selsaas
Data Engineer
Think Big Analytics <http://thinkbiganalytics.com>
lars.sels...@thinkbiganalytics.com
650-537-5321