I finally solved this problem: org.apache.hadoop.mapreduce.JobContext is a class in Hadoop < 2.0 but an interface in Hadoop >= 2.0, so I have to use a Spark build for Hadoop v1.
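By the way, a quick way to confirm which Hadoop binding a given assembly jar ships is to inspect the bundled JobContext with javap (part of the JDK). The jar name below is from my install:

# A Hadoop 1.x build prints "public class org.apache.hadoop.mapreduce.JobContext ...";
# a Hadoop 2.x build prints "public interface ..." instead.
javap -classpath /home/jererc/spark/lib/spark-assembly-1.1.0-hadoop1.0.4.jar \
    org.apache.hadoop.mapreduce.JobContext | head -n 2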
So spark-sql seems fine. But the thriftserver does not work with my config! Here is my spark-env.sh:

#!/usr/bin/env bash
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export SPARK_HOME=/home/jererc/spark
export SPARK_MASTER_IP=10.194.30.2
export SPARK_WORKER_CORES=2
export SPARK_WORKER_INSTANCES=4
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=4g
export MASTER=spark://${SPARK_MASTER_IP}:${SPARK_MASTER_PORT}
export CLASSPATH=$(echo ${SPARK_HOME}/lib/*.jar | sed 's/ /:/g'):$CLASSPATH
export SPARK_CLASSPATH=$CLASSPATH

Here is the output:

root@cdb-01:~/spark# ./sbin/start-thriftserver.sh --master spark://10.194.30.2:7077 --driver-class-path $CLASSPATH --hiveconf hive.server2.thrift.bind.host 0.0.0.0 --hiveconf hive.server2.thrift.port 10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/jererc/spark/lib/spark-assembly-1.1.0-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/jererc/spark/lib/spark-examples-1.1.0-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/11/20 14:55:35 INFO thriftserver.HiveThriftServer2: Starting SparkContext
14/11/20 14:55:35 WARN spark.SparkConf: SPARK_CLASSPATH was detected (set to '/home/jererc/spark/lib/cassandra-all-1.2.9.jar:/home/jererc/spark/lib/cassandra-thrift-1.2.9.jar:/home/jererc/spark/lib/datanucleus-api-jdo-3.2.1.jar:/home/jererc/spark/lib/datanucleus-core-3.2.2.jar:/home/jererc/spark/lib/datanucleus-rdbms-3.2.1.jar:/home/jererc/spark/lib/hadoop-core-0.20.205.0.jar:/home/jererc/spark/lib/hive-cassandra-1.2.9.jar:/home/jererc/spark/lib/mysql-connector-java.jar:/home/jererc/spark/lib/spark-assembly-1.1.0-hadoop1.0.4.jar:/home/jererc/spark/lib/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar:/home/jererc/spark/lib/spark-examples-1.1.0-hadoop1.0.4.jar:').
This is deprecated in Spark 1.0+. Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath
14/11/20 14:55:35 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/jererc/spark/lib/cassandra-all-1.2.9.jar:/home/jererc/spark/lib/cassandra-thrift-1.2.9.jar:/home/jererc/spark/lib/datanucleus-api-jdo-3.2.1.jar:/home/jererc/spark/lib/datanucleus-core-3.2.2.jar:/home/jererc/spark/lib/datanucleus-rdbms-3.2.1.jar:/home/jererc/spark/lib/hadoop-core-0.20.205.0.jar:/home/jererc/spark/lib/hive-cassandra-1.2.9.jar:/home/jererc/spark/lib/mysql-connector-java.jar:/home/jererc/spark/lib/spark-assembly-1.1.0-hadoop1.0.4.jar:/home/jererc/spark/lib/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar:/home/jererc/spark/lib/spark-examples-1.1.0-hadoop1.0.4.jar:' as a work-around.
Exception in thread "main" org.apache.spark.SparkException: Found both spark.driver.extraClassPath and SPARK_CLASSPATH. Use only the former.
    at org.apache.spark.SparkConf$$anonfun$validateSettings$5$$anonfun$apply$6.apply(SparkConf.scala:300)
    at org.apache.spark.SparkConf$$anonfun$validateSettings$5$$anonfun$apply$6.apply(SparkConf.scala:298)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.SparkConf$$anonfun$validateSettings$5.apply(SparkConf.scala:298)
    at org.apache.spark.SparkConf$$anonfun$validateSettings$5.apply(SparkConf.scala:286)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkConf.validateSettings(SparkConf.scala:286)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:158)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:36)
    at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:57)
    at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

And if I don't use SPARK_CLASSPATH, then spark-sql does not work. I tried ADD_JARS without much success. What's the best way to set the CLASSPATH and the jars?
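In the meantime, my reading of the deprecation warning above is that I should drop the SPARK_CLASSPATH export from spark-env.sh entirely and pass the jars through spark-submit options instead. Something like this (an untested sketch; same jar directory as in my spark-env.sh, and note the key=value form for --hiveconf):

# Build the colon-separated jar list the same way spark-env.sh did,
# then pass it only via --driver-class-path, with no SPARK_CLASSPATH set.
JARS=$(echo /home/jererc/spark/lib/*.jar | sed 's/ /:/g')
./sbin/start-thriftserver.sh --master spark://10.194.30.2:7077 \
    --driver-class-path "${JARS}" \
    --hiveconf hive.server2.thrift.bind.host=0.0.0.0 \
    --hiveconf hive.server2.thrift.port=10000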
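And for the executors (and so that spark-sql picks up the same jars without SPARK_CLASSPATH), I could presumably set the equivalent properties in conf/spark-defaults.conf, which spark-submit reads. An abbreviated sketch with just two of the jars; the full colon-separated list from SPARK_CLASSPATH above would go on each line:

# conf/spark-defaults.conf (abbreviated sketch, not my actual file)
spark.driver.extraClassPath    /home/jererc/spark/lib/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar:/home/jererc/spark/lib/mysql-connector-java.jar
spark.executor.extraClassPath  /home/jererc/spark/lib/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar:/home/jererc/spark/lib/mysql-connector-java.jar

Does that sound like the right approach?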