Hi Nirmal,

I came across the following article: https://stackoverflow.com/questions/47497003/why-is-hive-creating-tables-in-the-local-file-system (and an updated reference link: https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration ). You should check "hive.metastore.warehouse.dir" in the Hive config files.
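For reference, the metastore warehouse setting normally lives in hive-site.xml and should carry an explicit hdfs:// scheme; a schemeless value can end up resolving against the local filesystem. A minimal sketch (the namenode host, port, and path are illustrative, not taken from this thread):

```xml
<!-- hive-site.xml: illustrative values; adjust host/port/path to your cluster -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://namenode:8020/apps/hive/warehouse</value>
</property>
```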
On Tue, Jun 18, 2019 at 8:09 PM Nirmal Kumar <nirmal.ku...@impetus.co.in> wrote:
> Just an update on the thread: the cluster is Kerberized.
>
> I'm trying to execute the query with a different user, xyz, not hive.
> It seems like a permission issue: the user xyz is trying to create a
> directory under /home/hive.
>
> Do I need some impersonation setting?
>
> Thanks,
> Nirmal
>
> ________________________________
> From: Nirmal Kumar
> Sent: Tuesday, June 18, 2019 5:56:06 PM
> To: Raymond Honderdors; Nirmal Kumar
> Cc: user
> Subject: RE: Unable to run simple spark-sql
>
> Hi Raymond,
>
> Permission on HDFS is 777:
> drwxrwxrwx   - impadmin hdfs          0 2019-06-13 16:09 /home/hive/spark-warehouse
>
> But it's pointing to a local file system:
> Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory
> 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
>
> Thanks,
> -Nirmal
>
> From: Raymond Honderdors <raymond.honderd...@sizmek.com>
> Sent: 18 June 2019 17:52
> To: Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid>
> Cc: user <user@spark.apache.org>
> Subject: Re: Unable to run simple spark-sql
>
> Hi,
> Can you check the permissions of the user running Spark
> on the HDFS folder where it tries to create the table?
>
> On Tue, Jun 18, 2019, 15:05 Nirmal Kumar <nirmal.ku...@impetus.co.in.invalid> wrote:
> Hi List,
>
> I tried running the following sample Java code using Spark2 version 2.0.0
> on YARN (HDP-2.5.0.0):
>
> import org.apache.spark.sql.SparkSession;
>
> public class SparkSQLTest {
>     public static void main(String[] args) {
>         SparkSession sparkSession = SparkSession.builder().master("yarn")
>                 .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
>                 .config("hive.metastore.uris", "thrift://xxxxxxxxx:9083")
>                 .config("spark.driver.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>                 .config("spark.yarn.am.extraJavaOptions", "-Dhdp.version=2.5.0.0-1245")
>                 .config("spark.yarn.jars", "hdfs:///tmp/lib/spark2/*")
>                 .enableHiveSupport().getOrCreate();
>
>         sparkSession.sql("insert into testdb.employee_orc select * from testdb.employee where empid<5");
>     }
> }
>
> I get the following error pointing to a local file system
> (file:/home/hive/spark-warehouse), and I am wondering where that is being picked up from:
>
> 16:08:21.321 [dispatcher-event-loop-7] INFO  org.apache.spark.storage.BlockManagerInfo - Added broadcast_0_piece0 in memory on 192.168.218.92:40831 (size: 30.6 KB, free: 4.0 GB)
> 16:08:21.322 [main] DEBUG org.apache.spark.storage.BlockManagerMaster - Updated info of block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Told master about block broadcast_0_piece0
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Put block broadcast_0_piece0 locally took 4 ms
> 16:08:21.323 [main] DEBUG org.apache.spark.storage.BlockManager - Putting block broadcast_0_piece0 without replication took 4 ms
> 16:08:21.326 [main] INFO  org.apache.spark.SparkContext - Created broadcast 0 from sql at SparkSQLTest.java:33
> 16:08:21.449 [main] DEBUG org.apache.spark.sql.hive.execution.InsertIntoHiveTable - Created staging dir = file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1 for path = file:/home/hive/spark-warehouse/testdb.db/employee_orc
> 16:08:21.451 [main] INFO  org.apache.hadoop.hive.common.FileUtils - Creating directory if it doesn't exist: file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1
> Exception in thread "main" java.lang.IllegalStateException: Cannot create staging directory 'file:/home/hive/spark-warehouse/testdb.db/employee_orc/.hive-staging_hive_2019-06-18_16-08-21_448_1691186175028734135-1'
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getStagingDir(InsertIntoHiveTable.scala:83)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalScratchDir(InsertIntoHiveTable.scala:97)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.getExternalTmpPath(InsertIntoHiveTable.scala:105)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:148)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:142)
>         at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:313)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>         at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
>         at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
>         at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
>         at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
>         at org.apache.spark.sql.Dataset.<init>(Dataset.scala:186)
>         at org.apache.spark.sql.Dataset.<init>(Dataset.scala:167)
>         at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:65)
>         at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
>         at com.xxxx.xxx.xxx.xxx.xxxx.SparkSQLTest.main(SparkSQLTest.java:33)
> 16:08:21.454 [pool-8-thread-1] INFO  org.apache.spark.SparkContext - Invoking stop() from shutdown hook
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.util.component.AbstractLifeCycle - stopping org.spark_project.jetty.server.Server@620aa4ea
> 16:08:21.455 [pool-8-thread-1] DEBUG org.spark_project.jetty.server.Server - Graceful shutdown org.spark_project.jetty.server.Server@620aa4ea by
>
> Thanks,
> -Nirmal
>
> ________________________________
>
> NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org

--
Raymond Honderdors
R&D Tech Lead / Open Source evangelist
raymond.honderd...@sizmek.com
w: +972732535698
Herzliya
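The failure pattern in this thread (a warehouse path with no hdfs:// scheme ending up as file:/...) can be illustrated with plain java.net.URI resolution. This is only a sketch of the scheme rule: Spark and Hive actually resolve paths through Hadoop's Path/FileSystem classes, and the "namenode:8020" host below is illustrative, not from this thread.

```java
import java.net.URI;

// Sketch: a path with no scheme inherits the scheme and authority of the
// filesystem it is resolved against. This is how a schemeless warehouse
// dir like "/apps/hive/warehouse" can silently become a file: URI when
// the effective default filesystem is the local one.
public class WarehousePathCheck {
    public static void main(String[] args) {
        URI localDefault = URI.create("file:///");               // local-FS fallback
        URI hdfsDefault  = URI.create("hdfs://namenode:8020/");  // illustrative namenode

        String warehouse = "/apps/hive/warehouse";               // no scheme, as in the report

        // Same relative path, two very different results:
        System.out.println(localDefault.resolve(warehouse)); // file:///apps/hive/warehouse
        System.out.println(hdfsDefault.resolve(warehouse));  // hdfs://namenode:8020/apps/hive/warehouse

        // An explicit scheme is unambiguous regardless of the default FS:
        System.out.println(URI.create("hdfs://namenode:8020/apps/hive/warehouse").getScheme()); // hdfs
    }
}
```

In other words, spelling the warehouse location with a full hdfs:// URI in both spark.sql.warehouse.dir and hive.metastore.warehouse.dir removes the dependence on whatever default filesystem the driver happens to see.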