Sunny malik created SPARK-50759:
-----------------------------------

             Summary: Spark catalog api bug when working with non-hms based 
catalog
                 Key: SPARK-50759
                 URL: https://issues.apache.org/jira/browse/SPARK-50759
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.4.0, 3.3.0
            Reporter: Sunny malik


Hi

I am encountering issues while working with a REST-based catalog. My Spark 
session is configured with a default catalog that uses the REST-based 
implementation.

The {{SparkSession.catalog}} API does not function correctly with the 
REST-based catalog. This issue has been tested and observed in Spark 3.4.

----------------------------------------------------------------------------------

${SPARK_HOME}/bin/spark-shell --master local[*] \
  --driver-memory 2g \
  --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
  --conf spark.sql.catalog.iceberg.uri=https://xx.xxx.xxxx.domain.com \
  --conf spark.sql.warehouse.dir=$SQL_WAREHOUSE_DIR \
  --conf spark.sql.defaultCatalog=iceberg \
  --conf spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.iceberg.catalog-impl=org.apache.iceberg.rest.RESTCatalog

scala> spark.catalog.currentCatalog
res1: String = iceberg

scala> spark.sql("select * from restDb.restTable").show
+---+----------+
| id|      data|
+---+----------+
|  1|some_value|
+---+----------+

scala> spark.catalog.tableExists("restDb.restTable")
*res3: Boolean = true*

scala> spark.catalog.tableExists("restDb", "restTable")
*res4: Boolean = false*

----------------------------------------------------------------------------------
 
The API spark.catalog.tableExists(String databaseName, String tableName) 
is only meant to work with an HMS-based catalog 
([https://github.com/apache/spark/blob/5a91172c019c119e686f8221bbdb31f59d3d7776/sql/core/src/main/scala/org/apache/spark/sql/catalog/Catalog.scala#L224])
 
The API spark.catalog.tableExists(String tableName) 
is meant to work with both HMS and non-HMS based catalogs, as the 
transcript above shows.
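
Until this is resolved, one possible workaround is to route two-argument checks through the single-argument overload by building a qualified name first. The following is only a minimal sketch: CatalogCompat, quotePart, qualifiedName and tableExistsIn are hypothetical helper names and not part of Spark, and the backtick quoting rule is an assumption based on Spark SQL's identifier syntax. The helper takes the existence check as a plain function so it compiles without a Spark dependency.

```scala
// Hypothetical helper, not part of Spark: builds a qualified table name and
// delegates to a single-argument existence check.
object CatalogCompat {

  // Wrap a name part in backticks when it contains characters other than
  // letters, digits or underscore (assumed quoting rule; embedded backticks
  // are doubled).
  def quotePart(part: String): String =
    if (part.matches("[A-Za-z0-9_]+")) part
    else "`" + part.replace("`", "``") + "`"

  // Join name parts into a dot-separated identifier, e.g. "restDb.restTable".
  def qualifiedName(parts: String*): String =
    parts.map(quotePart).mkString(".")

  // `exists` is expected to be the single-argument overload, e.g.
  // (name: String) => spark.catalog.tableExists(name)
  def tableExistsIn(exists: String => Boolean)(dbName: String, tableName: String): Boolean =
    exists(qualifiedName(dbName, tableName))
}
```

In the spark-shell session above this could be called as CatalogCompat.tableExistsIn((name: String) => spark.catalog.tableExists(name))("restDb", "restTable"), which passes "restDb.restTable" to the overload that returned true.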
 
 
Suggested resolutions
1. Make spark.catalog.tableExists(String databaseName, String tableName) 
throw a runtime exception if the session catalog is a non-HMS based catalog.
2. Deprecate the HMS-specific API in a newer Spark release, as Spark already has 
an API that works with both HMS and non-HMS based catalogs.
 
Thanks
Sunny



