I finally solved this problem: org.apache.hadoop.mapreduce.JobContext is a
class in Hadoop < 2.0 but an interface in Hadoop >= 2.0, so I have to use a
Spark build for Hadoop v1.
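
For the record, here is roughly how to get a Hadoop-1 build (a sketch from
memory; the archive URL and exact version numbers are assumptions, adjust to
your mirror):

# Option 1: download a prebuilt Hadoop-1 package
wget http://archive.apache.org/dist/spark/spark-1.1.0/spark-1.1.0-bin-hadoop1.tgz
tar xzf spark-1.1.0-bin-hadoop1.tgz

# Option 2: build from source against Hadoop 1.x
mvn -Dhadoop.version=1.0.4 -DskipTests clean package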

So spark-sql now seems fine, but the Thrift server does not work with my
config!

Here is my spark-env.sh:

#!/usr/bin/env bash
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
export SPARK_HOME=/home/jererc/spark
export SPARK_MASTER_IP=10.194.30.2
export SPARK_WORKER_CORES=2
export SPARK_WORKER_INSTANCES=4
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=4g
export MASTER=spark://${SPARK_MASTER_IP}:${SPARK_MASTER_PORT}
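# Put every jar in $SPARK_HOME/lib on the classpath (colon-separated)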
export CLASSPATH=$(echo ${SPARK_HOME}/lib/*.jar | sed 's/ /:/g'):$CLASSPATH
export SPARK_CLASSPATH=$CLASSPATH
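
The last two lines are what triggers the SPARK_CLASSPATH deprecation warning
below; from what I understand, the supported equivalent would be entries in
conf/spark-defaults.conf like these (untested sketch, jar list shortened):

spark.driver.extraClassPath    /home/jererc/spark/lib/cassandra-all-1.2.9.jar:...(rest of the jars)
spark.executor.extraClassPath  /home/jererc/spark/lib/cassandra-all-1.2.9.jar:...(rest of the jars)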

Here is the output:

root@cdb-01:~/spark# ./sbin/start-thriftserver.sh --master
spark://10.194.30.2:7077 --driver-class-path $CLASSPATH --hiveconf
hive.server2.thrift.bind.host 0.0.0.0 --hiveconf hive.server2.thrift.port
10000
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/home/jererc/spark/lib/spark-assembly-1.1.0-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/home/jererc/spark/lib/spark-examples-1.1.0-hadoop1.0.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/11/20 14:55:35 INFO thriftserver.HiveThriftServer2: Starting SparkContext
14/11/20 14:55:35 WARN spark.SparkConf:
SPARK_CLASSPATH was detected (set to
'/home/jererc/spark/lib/cassandra-all-1.2.9.jar:/home/jererc/spark/lib/cassandra-thrift-1.2.9.jar:/home/jererc/spark/lib/datanucleus-api-jdo-3.2.1.jar:/home/jererc/spark/lib/datanucleus-core-3.2.2.jar:/home/jererc/spark/lib/datanucleus-rdbms-3.2.1.jar:/home/jererc/spark/lib/hadoop-core-0.20.205.0.jar:/home/jererc/spark/lib/hive-cassandra-1.2.9.jar:/home/jererc/spark/lib/mysql-connector-java.jar:/home/jererc/spark/lib/spark-assembly-1.1.0-hadoop1.0.4.jar:/home/jererc/spark/lib/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar:/home/jererc/spark/lib/spark-examples-1.1.0-hadoop1.0.4.jar:').
This is deprecated in Spark 1.0+.

Please instead use:
 - ./spark-submit with --driver-class-path to augment the driver classpath
 - spark.executor.extraClassPath to augment the executor classpath

14/11/20 14:55:35 WARN spark.SparkConf: Setting
'spark.executor.extraClassPath' to
'/home/jererc/spark/lib/cassandra-all-1.2.9.jar:/home/jererc/spark/lib/cassandra-thrift-1.2.9.jar:/home/jererc/spark/lib/datanucleus-api-jdo-3.2.1.jar:/home/jererc/spark/lib/datanucleus-core-3.2.2.jar:/home/jererc/spark/lib/datanucleus-rdbms-3.2.1.jar:/home/jererc/spark/lib/hadoop-core-0.20.205.0.jar:/home/jererc/spark/lib/hive-cassandra-1.2.9.jar:/home/jererc/spark/lib/mysql-connector-java.jar:/home/jererc/spark/lib/spark-assembly-1.1.0-hadoop1.0.4.jar:/home/jererc/spark/lib/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar:/home/jererc/spark/lib/spark-examples-1.1.0-hadoop1.0.4.jar:'
as a work-around.
Exception in thread "main" org.apache.spark.SparkException: Found both
spark.driver.extraClassPath and SPARK_CLASSPATH. Use only the former.
        at
org.apache.spark.SparkConf$$anonfun$validateSettings$5$$anonfun$apply$6.apply(SparkConf.scala:300)
        at
org.apache.spark.SparkConf$$anonfun$validateSettings$5$$anonfun$apply$6.apply(SparkConf.scala:298)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at
org.apache.spark.SparkConf$$anonfun$validateSettings$5.apply(SparkConf.scala:298)
        at
org.apache.spark.SparkConf$$anonfun$validateSettings$5.apply(SparkConf.scala:286)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.SparkConf.validateSettings(SparkConf.scala:286)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:158)
        at
org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:36)
        at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:57)
        at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


And if I don't set SPARK_CLASSPATH at all, spark-sql stops working. I also
tried ADD_JARS, without much success.
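
Based on the warning above, I guess the intended replacement is to drop
SPARK_CLASSPATH entirely and pass everything on the command line; this is the
kind of untested sketch I have in mind (same jar glob as in my spark-env.sh;
the assembly/examples jars probably don't belong in --jars):

unset SPARK_CLASSPATH
./sbin/start-thriftserver.sh --master spark://10.194.30.2:7077 \
  --driver-class-path $(echo ${SPARK_HOME}/lib/*.jar | sed 's/ /:/g') \
  --jars $(echo ${SPARK_HOME}/lib/*.jar | sed 's/ /,/g') \
  --hiveconf hive.server2.thrift.bind.host=0.0.0.0 \
  --hiveconf hive.server2.thrift.port=10000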

What's the best way to set the CLASSPATH and the jars?



