Hello, I want to expose the result of a Spark computation to external tools. I plan to do this through the Thrift server's JDBC interface by registering the result DataFrame as a temp table. I wrote a sample program in spark-shell to test this.
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext.implicits._

HiveThriftServer2.startWithContext(hiveContext)

val myDF = hiveContext.read.format("com.databricks.spark.csv")
  .option("header", "true")
  .load("/datafolder/weblog/pages.csv")

myDF.registerTempTable("temp_table")

I'm able to see the temp table in Beeline:

+-------------+--------------+
|  tableName  | isTemporary  |
+-------------+--------------+
| temp_table  | true         |
| my_table    | false        |
+-------------+--------------+

Now when I issue "select * from temp_table" from Beeline, I see the exception below in spark-shell:

15/07/13 17:18:27 WARN ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: java.lang.ClassNotFoundException: com.databricks.spark.csv.CsvRelation$$anonfun$buildScan$1$$anonfun$1
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:206)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

I'm able to read the other table ("my_table") from Beeline, though. Any suggestions on how to overcome this?

This is with the Spark 1.4 pre-built version. Spark-shell was started with --packages to pass spark-csv.

Srikanth
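[Editor's note] A minimal sketch of two possible workarounds, assuming the root cause is that the Thrift server's statement execution cannot load the spark-csv classes at query time even though spark-shell can. Both ideas materialize the data from within spark-shell (where the --packages classes are loadable) so that Beeline queries no longer trigger the CSV scan. The paths and table names below are hypothetical, and this is an untested suggestion, not a confirmed fix:

```scala
// Option 1 (hypothetical): eagerly cache the temp table from spark-shell.
// CACHE TABLE is eager in Spark SQL, so the CSV scan runs here, in the
// shell's classloader; subsequent Beeline queries read the in-memory data.
hiveContext.sql("CACHE TABLE temp_table")

// Option 2 (hypothetical): persist the result in a format Spark reads
// natively (no external package needed at query time), then register that.
myDF.write.parquet("/datafolder/weblog/pages_parquet")  // assumed path
hiveContext.read.parquet("/datafolder/weblog/pages_parquet")
  .registerTempTable("temp_table_parquet")
```

If Option 1 works, the exception would only disappear while the cache is populated; Option 2 avoids the spark-csv code path entirely once the Parquet copy exists.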