Hi Everyone,

I'm stuck with a problem where I need to provide a custom GCS location for a Hive table from Spark. The code fails during an *insert into* whenever the Hive table's location is a flat bucket-root path like gs://<bucket_name>/, but works fine for nested locations like gs://<bucket_name>/<blob_name>/.
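From the stack trace in Case 1 below, the NPE seems to originate in Hadoop's Path.suffix, which rebuilds the path from its parent, and for a bucket-root path like gs://test_dd1/ the parent is null. A quick isolated check of that behaviour (this is my assumption about the root cause, not a confirmed Spark diagnosis; it only needs hadoop-common on the classpath):

import org.apache.hadoop.fs.Path

// Bucket root: the path component is "/", so getParent returns null and
// suffix() would hand that null parent to the Path constructor (the NPE site).
val rootPath = new Path("gs://test_dd1/")
println(rootPath.getParent)            // null
// rootPath.suffix("-ext")             // throws java.lang.NullPointerException

// Nested location: a parent exists, so suffix() works as expected.
val nestedPath = new Path("gs://test_dd1/abc/")
println(nestedPath.getParent)          // gs://test_dd1/
println(nestedPath.suffix("-ext"))     // gs://test_dd1/abc-ext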
Is anyone aware whether this is an issue on the Spark side, or is there any config I need to pass for it? *The issue happens on both Spark 2.x and 3.x.*

Configs being set:

spark.conf.set("spark.hadoop.hive.exec.dynamic.partition.mode", "nonstrict")
spark.conf.set("spark.hadoop.hive.exec.dynamic.partition", true)
spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
spark.conf.set("hive.exec.dynamic.partition", true)

*Case 1: FAILS (table location is the bucket root)*

val DF = Seq(("test1", 123)).toDF("name", "num")
val partKey = List("num")
DF.write.option("path", "gs://test_dd1/").mode(SaveMode.Overwrite).partitionBy(partKey: _*).format("orc").saveAsTable("us_wm_supply_chain_otif_stg.test_tb1")

val DF1 = Seq(("test2", 125)).toDF("name", "num")
DF1.write.mode(SaveMode.Overwrite).format("orc").insertInto("us_wm_supply_chain_otif_stg.test_tb1")

*java.lang.NullPointerException
  at org.apache.hadoop.fs.Path.<init>(Path.java:141)
  at org.apache.hadoop.fs.Path.<init>(Path.java:120)
  at org.apache.hadoop.fs.Path.suffix(Path.java:441)
  at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.$anonfun$getCustomPartitionLocations$1(InsertIntoHadoopFsRelationCommand.scala:254)*

*Case 2: Succeeds (table location is a nested path)*

val DF = Seq(("test1", 123)).toDF("name", "num")
val partKey = List("num")
DF.write.option("path", "gs://test_dd1/abc/").mode(SaveMode.Overwrite).partitionBy(partKey: _*).format("orc").saveAsTable("us_wm_supply_chain_otif_stg.test_tb2")

val DF1 = Seq(("test2", 125)).toDF("name", "num")
DF1.write.mode(SaveMode.Overwrite).format("orc").insertInto("us_wm_supply_chain_otif_stg.test_tb2")

With Best Regards,
Dipayan Dev