Hi,

Can you check the permissions of the user running Spark on the HDFS folder where it tries to create the table?
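From the command line that would be something like `hdfs dfs -ls /apps/hive/warehouse/testdb.db` run as the job user. If it helps, here is a rough sketch of the same check via the Hadoop FileSystem API — the namenode address and table path below are placeholders I made up, not values from your cluster:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckWarehousePerms {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder namenode URI -- point this at your cluster.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        try (FileSystem fs = FileSystem.get(conf)) {
            // Placeholder table directory -- use the actual target path.
            FileStatus st = fs.getFileStatus(
                new Path("/apps/hive/warehouse/testdb.db/employee_orc"));
            // The user submitting the Spark job needs write access here.
            System.out.println("owner=" + st.getOwner()
                + " group=" + st.getGroup()
                + " perms=" + st.getPermission());
        }
    }
}
```

This only tells you the owner and mode bits; whether the insert succeeds still depends on which user the YARN containers actually run as.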
On Tue, Jun 18, 2019, 15:05 Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid> wrote:

> Hi List,
>
> I tried running the following sample Java code using Spark2 version 2.0.0
> on YARN (HDP-2.5.0.0):
>
> import org.apache.spark.sql.SparkSession;
>
> public class SparkSQLTest {
>     public static void main(String[] args) {
>         SparkSession sparkSession = SparkSession.builder().master("yarn")
>             .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
>             .config("hive.metastore.uris", "thrift://xxxxxxxxx:9083")
>             .config("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>             .config("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>             .config("spark.yarn.jars", "hdfs:///tmp/lib/spark2/*")
>             .enableHiveSupport()
>             .getOrCreate();
>
>         sparkSession.sql("insert into testdb.employee_orc select * from testdb.employee where empid<5");
>     }
> }
>
> I get the following error pointing to the local file system
> (file:/home/hive/spark-warehouse) and am wondering where that path is being picked up from:
>
> 16:08:21.321 [dispatcher-event-loop-7] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.218.92:40831 (size: 30.6 KB, free: 4.0 GB)
> 16:08:21.322 [main] DEBUG org.apache.spark.storage.BlockManagerMaster - Updated info of block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Told master about block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Put block broadcast_0_piece0 locally took 4 ms
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Putting block broadcast_0_piece0 without replication took 4 ms
> 16:08:21.326 [main] INFO org.apache.spark.SparkContext - Created broadcast 0 from sql at SparkSQLTest.java:33
> 16:08:21.449 [main] DEBUG org.apache.spark.sql.hive.execution.InsertIntoHiveTable - Created staging dir = file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1 for path = file:/home/hive/spark-warehouse/testdb.db/employee_orc
> 16:08:21.451 [main] INFO org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist: file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1
> Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getStagingDir(InsertIntoHiveTable.scala:83)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalScratchDir(InsertIntoHiveTable.scala:97)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalTmpPath(InsertIntoHiveTable.scala:105)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:148)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:142)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:313)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>     at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>     at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>     at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>     at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
>     at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
>     at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
>     at com.xxxx.xxx.xxx.xxx.xxxx.SparkSQLTest.main(SparkSQLTest.java:33)
> 16:08:21.454 [pool-8-thread-1] INFO org.apache.spark.SparkContext - Invoking stop() from shutdown hook
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.util.component.AbstractLifeCycle - stopping org.spark_project.jetty.server.Server@620aa4ea
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.server.Server - Graceful shutdown org.spark_project.jetty.server.Server@620aa4ea by
>
> Thanks,
> -Nirmal
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
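One more thing worth checking, regarding the staging dir landing under file:/home/hive/spark-warehouse: this is a guess on my part, not confirmed by the logs, but the quoted config sets "spark.sql.warehouse.dir" to "/apps/hive/warehouse" with no scheme, so the path is resolved against whatever default filesystem is in effect — local if the Hadoop/Hive config is not on the driver classpath. A tiny self-contained illustration of the difference:

```java
import java.net.URI;

public class WarehouseSchemeCheck {
    public static void main(String[] args) {
        // The path exactly as quoted: no scheme, so the filesystem it
        // lands on depends on the default FS configuration at runtime.
        URI bare = URI.create("/apps/hive/warehouse");
        System.out.println("scheme=" + bare.getScheme());   // scheme=null

        // An explicit scheme removes that ambiguity.
        URI pinned = URI.create("hdfs:///apps/hive/warehouse");
        System.out.println("scheme=" + pinned.getScheme()); // scheme=hdfs
    }
}
```

If HDFS is the intent, writing the warehouse dir as "hdfs:///apps/hive/warehouse", or making sure hive-site.xml is visible to the driver, would pin the target filesystem explicitly.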