Hi,

Can you check the permissions of the user running Spark on the HDFS folder where it tries to create the table?
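From the command line that would be something like `hdfs dfs -ls /apps/hive/warehouse/testdb.db` run as the job user. If it helps, here is a rough sketch of the same check via the Hadoop FileSystem API — the namenode address and table path below are placeholders I made up, not values from your cluster:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckWarehousePerms {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder namenode URI -- point this at your cluster.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        try (FileSystem fs = FileSystem.get(conf)) {
            // Placeholder table directory -- use the actual target path.
            FileStatus st = fs.getFileStatus(
                new Path("/apps/hive/warehouse/testdb.db/employee_orc"));
            // The user submitting the Spark job needs write access here.
            System.out.println("owner=" + st.getOwner()
                + " group=" + st.getGroup()
                + " perms=" + st.getPermission());
        }
    }
}
```

This only tells you the owner and mode bits; whether the insert succeeds still depends on which user the YARN containers actually run as.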
On Tue, Jun 18, 2019, 15:05 Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid> wrote:

> Hi List,
>
> I tried running the following sample Java code using Spark2 version 2.0.0
> on YARN (HDP-2.5.0.0):
>
> import org.apache.spark.sql.SparkSession;
>
> public class SparkSQLTest {
>     public static void main(String[] args) {
>         SparkSession sparkSession = SparkSession.builder().master("yarn")
>             .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
>             .config("hive.metastore.uris", "thrift://xxxxxxxxx:9083")
>             .config("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>             .config("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>             .config("spark.yarn.jars", "hdfs:///tmp/lib/spark2/*")
>             .enableHiveSupport()
>             .getOrCreate();
>
>         sparkSession.sql("insert into testdb.employee_orc select * from testdb.employee where empid<5");
>     }
> }
>
> I get the following error pointing to the local file system
> (file:/home/hive/spark-warehouse) and am wondering where that path is being picked up from:
>
> 16:08:21.321 [dispatcher-event-loop-7] INFO org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.218.92:40831 (size: 30.6 KB, free: 4.0 GB)
> 16:08:21.322 [main] DEBUG org.apache.spark.storage.BlockManagerMaster - Updated info of block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Told master about block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Put block broadcast_0_piece0 locally took 4 ms
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Putting block broadcast_0_piece0 without replication took 4 ms
> 16:08:21.326 [main] INFO org.apache.spark.SparkContext - Created broadcast 0 from sql at SparkSQLTest.java:33
> 16:08:21.449 [main] DEBUG org.apache.spark.sql.hive.execution.InsertIntoHiveTable - Created staging dir = file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1 for path = file:/home/hive/spark-warehouse/testdb.db/employee_orc
> 16:08:21.451 [main] INFO org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist: file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1
> Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getStagingDir(InsertIntoHiveTable.scala:83)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalScratchDir(InsertIntoHiveTable.scala:97)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalTmpPath(InsertIntoHiveTable.scala:105)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:148)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:142)
>     at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:313)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>     at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>     at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>     at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>     at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>     at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
>     at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
>     at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
>     at com.xxxx.xxx.xxx.xxx.xxxx.SparkSQLTest.main(SparkSQLTest.java:33)
> 16:08:21.454 [pool-8-thread-1] INFO org.apache.spark.SparkContext - Invoking stop() from shutdown hook
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.util.component.AbstractLifeCycle - stopping org.spark_project.jetty.server.Server@620aa4ea
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.server.Server - Graceful shutdown org.spark_project.jetty.server.Server@620aa4ea by
>
> Thanks,
> -Nirmal
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
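One more thing worth checking, regarding the staging dir landing under file:/home/hive/spark-warehouse: this is a guess on my part, not confirmed by the logs, but the quoted config sets "spark.sql.warehouse.dir" to "/apps/hive/warehouse" with no scheme, so the path is resolved against whatever default filesystem is in effect — local if the Hadoop/Hive config is not on the driver classpath. A tiny self-contained illustration of the difference:

```java
import java.net.URI;

public class WarehouseSchemeCheck {
    public static void main(String[] args) {
        // The path exactly as quoted: no scheme, so the filesystem it
        // lands on depends on the default FS configuration at runtime.
        URI bare = URI.create("/apps/hive/warehouse");
        System.out.println("scheme=" + bare.getScheme());   // scheme=null

        // An explicit scheme removes that ambiguity.
        URI pinned = URI.create("hdfs:///apps/hive/warehouse");
        System.out.println("scheme=" + pinned.getScheme()); // scheme=hdfs
    }
}
```

If HDFS is the intent, writing the warehouse dir as "hdfs:///apps/hive/warehouse", or making sure hive-site.xml is visible to the driver, would pin the target filesystem explicitly.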