Hi Nirmal,

I came across the following article: https://stackoverflow.com/questions/47497003/why-is-hive-creating-tables-in-the-local-file-system (and an updated reference link: https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration ). You should check "hive.metastore.warehouse.dir" in the Hive config files.
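For reference, the metastore warehouse setting normally lives in hive-site.xml and should carry an explicit hdfs:// scheme; a schemeless value can end up resolving against the local filesystem. A minimal sketch (the namenode host, port, and path are illustrative, not taken from this thread):

```xml
<!-- hive-site.xml: illustrative values; adjust host/port/path to your cluster -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://namenode:8020/apps/hive/warehouse</value>
</property>
```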
On Tue, Jun 18, 2019 at 8:09 PM Nirmal Kumar <nirmal.ku...@impetus.co.in> wrote:
> Just an update on the thread: the cluster is Kerberized.
>
> I'm trying to execute the query with a different user, xyz, not hive.
> It seems like a permission issue: the user xyz is trying to create a
> directory under /home/hive.
>
> Do I need some impersonation setting?
>
> Thanks,
> Nirmal
>
> ________________________________
> From: Nirmal Kumar
> Sent: Tuesday, June 18, 2019 5:56:06 PM
> To: Raymond Honderdors; Nirmal Kumar
> Cc: user
> Subject: RE: Unable to run simple spark-sql
>
> Hi Raymond,
>
> Permission on HDFS is 777:
> drwxrwxrwx   - impadmin hdfs          0 2019-06-13 16:09 /home/hive/spark-warehouse
>
> But it's pointing to a local file system:
> Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory
> 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
>
> Thanks,
> -Nirmal
>
> From: Raymond Honderdors <raymond.honderd...@sizmek.com>
> Sent: 18 June 2019 17:52
> To: Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid>
> Cc: user <user@spark.apache.org>
> Subject: Re: Unable to run simple spark-sql
>
> Hi,
> Can you check the permissions of the user running Spark
> on the HDFS folder where it tries to create the table?
>
> On Tue, Jun 18, 2019, 15:05 Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid> wrote:
> Hi List,
>
> I tried running the following sample Java code using Spark2 version 2.0.0
> on YARN (HDP-2.5.0.0):
>
> import org.apache.spark.sql.SparkSession;
>
> public class SparkSQLTest {
>     public static void main(String[] args) {
>         SparkSession sparkSession = SparkSession.builder().master("yarn")
>                 .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
>                 .config("hive.metastore.uris", "thrift://xxxxxxxxx:9083")
>                 .config("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>                 .config("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>                 .config("spark.yarn.jars", "hdfs:///tmp/lib/spark2/*")
>                 .enableHiveSupport().getOrCreate();
>
>         sparkSession.sql("insert into testdb.employee_orc select * from testdb.employee where empid<5");
>     }
> }
>
> I get the following error pointing to a local file system
> (file:/home/hive/spark-warehouse), and I am wondering where that is being picked up from:
>
> 16:08:21.321 [dispatcher-event-loop-7] INFO  org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.218.92:40831 (size: 30.6 KB, free: 4.0 GB)
> 16:08:21.322 [main] DEBUG org.apache.spark.storage.BlockManagerMaster - Updated info of block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Told master about block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Put block broadcast_0_piece0 locally took 4 ms
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Putting block broadcast_0_piece0 without replication took 4 ms
> 16:08:21.326 [main] INFO  org.apache.spark.SparkContext - Created broadcast 0 from sql at SparkSQLTest.java:33
> 16:08:21.449 [main] DEBUG org.apache.spark.sql.hive.execution.InsertIntoHiveTable - Created staging dir = file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1 for path = file:/home/hive/spark-warehouse/testdb.db/employee_orc
> 16:08:21.451 [main] INFO  org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist: file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1
> Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getStagingDir(InsertIntoHiveTable.scala:83)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalScratchDir(InsertIntoHiveTable.scala:97)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalTmpPath(InsertIntoHiveTable.scala:105)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:148)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:142)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:313)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>         at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>         at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>         at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>         at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>         at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>         at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
>         at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
>         at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
>         at com.xxxx.xxx.xxx.xxx.xxxx.SparkSQLTest.main(SparkSQLTest.java:33)
> 16:08:21.454 [pool-8-thread-1] INFO  org.apache.spark.SparkContext - Invoking stop() from shutdown hook
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.util.component.AbstractLifeCycle - stopping org.spark_project.jetty.server.Server@620aa4ea
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.server.Server - Graceful shutdown org.spark_project.jetty.server.Server@620aa4ea by
>
> Thanks,
> -Nirmal
>
> ________________________________
>
> NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org

--
Raymond Honderdors
R&D Tech Lead / Open Source evangelist
raymond.honderd...@sizmek.com
w: +972732535698
Herzliya
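The failure pattern in this thread (a warehouse path with no hdfs:// scheme ending up as file:/...) can be illustrated with plain java.net.URI resolution. This is only a sketch of the scheme rule: Spark and Hive actually resolve paths through Hadoop's Path/FileSystem classes, and the "namenode:8020" host below is illustrative, not from this thread.

```java
import java.net.URI;

// Sketch: a path with no scheme inherits the scheme and authority of the
// filesystem it is resolved against. This is how a schemeless warehouse
// dir like "/apps/hive/warehouse" can silently become a file: URI when
// the effective default filesystem is the local one.
public class WarehousePathCheck {
    public static void main(String[] args) {
        URI localDefault = URI.create("file:///");               // local-FS fallback
        URI hdfsDefault  = URI.create("hdfs://namenode:8020/");  // illustrative namenode

        String warehouse = "/apps/hive/warehouse";               // no scheme, as in the report

        // Same relative path, two very different results:
        System.out.println(localDefault.resolve(warehouse)); // file:///apps/hive/warehouse
        System.out.println(hdfsDefault.resolve(warehouse));  // hdfs://namenode:8020/apps/hive/warehouse

        // An explicit scheme is unambiguous regardless of the default FS:
        System.out.println(URI.create("hdfs://namenode:8020/apps/hive/warehouse").getScheme()); // hdfs
    }
}
```

In other words, spelling the warehouse location with a full hdfs:// URI in both spark.sql.warehouse.dir and hive.metastore.warehouse.dir removes the dependence on whatever default filesystem the driver happens to see.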