Some more info; I'm still digging. I'm just trying to do `spark.table("db.table").count` from a spark-shell; "db.table" is just a Hive table.
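For completeness, here is roughly what I'm running (a minimal sketch; "db.table" is just a placeholder for an existing Hive table, and `spark` is the SparkSession the shell provides):

    // spark-shell, built from the master commits mentioned below
    spark.table("db.table").count            // row count at b67668b; AnalysisException after ca99171
    spark.catalog.listDatabases.show(false)  // what the session's catalog can see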
At commit b67668b this worked just fine and returned the number of rows in db.table. Starting at ca99171, "[SPARK-15073][SQL] Hide SparkSession constructor from the public", it fails with:

org.apache.spark.sql.AnalysisException: Database 'db' does not exist;
  at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.requireDbExists(ExternalCatalog.scala:37)
  at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.tableExists(InMemoryCatalog.scala:195)
  at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.requireTableExists(InMemoryCatalog.scala:63)
  at org.apache.spark.sql.catalyst.catalog.InMemoryCatalog.getTable(InMemoryCatalog.scala:186)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupRelation(SessionCatalog.scala:337)
  at org.apache.spark.sql.SparkSession.table(SparkSession.scala:524)
  at org.apache.spark.sql.SparkSession.table(SparkSession.scala:520)
  ... 48 elided

If I run

  org.apache.spark.sql.SparkSession.builder.enableHiveSupport.getOrCreate.catalog.listDatabases.show(false)

I get

+-------------------------------------------------------------------------------------------------+-----------+-----------+
|name                                                                                               |description|locationUri|
+-------------------------------------------------------------------------------------------------+-----------+-----------+
|Database[name='default', description='default database', path='hdfs://ns/{CWD}/spark-warehouse']|
+-------------------------------------------------------------------------------------------------+-----------+-----------+

where {CWD} is the current working directory in which I started my spark-shell. It looks like this commit causes spark.catalog to be the internal one instead of the Hive one (a rough sketch of one way to check that is at the end of this message).

Michael, I don't think this is related to the HDFS configurations; they are in /etc/hadoop/conf on each of the nodes in the cluster.

Arun, I was referring to these docs: http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/sql-programming-guide.html. They need to be updated to not refer to HiveContext. I don't think HiveContext should be marked as private[hive]; it should be public.

I'll keep digging.

Doug

> On May 19, 2016, at 6:52 PM, Reynold Xin <r...@databricks.com> wrote:
>
> The old one is deprecated but should still work though.
>
>
> On Thu, May 19, 2016 at 3:51 PM, Arun Allamsetty <arun.allamse...@gmail.com> wrote:
> Hi Doug,
>
> If you look at the API docs here:
> http://home.apache.org/~pwendell/spark-releases/spark-2.0.0-preview-docs/api/scala/index.html#org.apache.spark.sql.hive.HiveContext,
> you'll see "Deprecated (Since version 2.0.0) Use SparkSession.builder.enableHiveSupport instead".
> So you probably need to use that.
>
> Arun
>
> On Thu, May 19, 2016 at 3:44 PM, Michael Armbrust <mich...@databricks.com> wrote:
> 1. "val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)" doesn't work because
> "HiveContext not a member of org.apache.spark.sql.hive". I checked the documentation, and it
> looks like it should still work for spark-2.0.0-preview-bin-hadoop2.7.tgz
>
> HiveContext has been deprecated and moved to a 1.x compatibility package, which you'll need to
> include explicitly. Docs have not been updated yet.
>
> 2. I also tried the new spark session, 'spark.table("db.table")'; it fails with an HDFS
> permission denied: can't write to "/user/hive/warehouse"
>
> Where are the HDFS configurations located?
> We might not be propagating them correctly any more.
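P.S. Here is the rough check I mentioned above for which catalog implementation the session ends up with. This is only a sketch under a couple of assumptions: that "spark.sql.catalogImplementation" is the internal setting that selects between the in-memory catalog and the Hive one ("in-memory" by default, with enableHiveSupport expected to switch it to "hive"), that it is visible on the SparkContext's conf, and that the build has Hive support compiled in.

    import org.apache.spark.sql.SparkSession

    // Get (or re-use) a session, explicitly asking for Hive support.
    val spark = SparkSession.builder.enableHiveSupport.getOrCreate()

    // Assumed key: "spark.sql.catalogImplementation". If this prints "in-memory",
    // that would explain why only the default database shows up under the local
    // spark-warehouse directory instead of the Hive metastore.
    println(spark.sparkContext.getConf.get("spark.sql.catalogImplementation", "in-memory"))

    // The same listing as above, for comparison.
    spark.catalog.listDatabases.show(false)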