It's buggy behavior for a custom v2 catalog (one that does not extend DelegatingCatalogExtension) to expect Spark to keep using the v1 DDL commands to operate on its tables. This is also why third-party catalogs (e.g. Unity Catalog and Apache Polaris) cannot be used to override `spark_catalog` if people still want to use the Spark built-in file sources.
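
To illustrate the pattern Spark expects, here is a minimal sketch against the Spark 3.5 catalog API; the class name and the logging override are hypothetical, not from any real catalog. Extending DelegatingCatalogExtension signals to Spark that the catalog is only a layer over the built-in session catalog, so the v1 DDL paths remain safe to use:

    import org.apache.spark.sql.catalyst.analysis.NoSuchTableException;
    import org.apache.spark.sql.connector.catalog.DelegatingCatalogExtension;
    import org.apache.spark.sql.connector.catalog.Identifier;
    import org.apache.spark.sql.connector.catalog.Table;

    // Hypothetical catalog that decorates the built-in session catalog.
    public class LoggingSessionCatalog extends DelegatingCatalogExtension {
      // Override only what you customize; every other call falls through
      // to the delegate that Spark injects (the built-in spark_catalog).
      @Override
      public Table loadTable(Identifier ident) throws NoSuchTableException {
        System.out.println("loadTable: " + ident);
        return super.loadTable(ident);
      }
    }

Registered via spark.sql.catalog.spark_catalog=com.example.LoggingSessionCatalog, such a class keeps its own metastore out of the picture and lets Spark route DDL through the v1 code paths as before.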
Technically, I think it's wrong for a third-party catalog to rely on Spark's session catalog without extending `DelegatingCatalogExtension`, as that confuses Spark. If the catalog has its own metastore, it should not delegate requests to the Spark session catalog or use v1 DDL commands, which only work with the Spark session catalog. Otherwise, it should extend `DelegatingCatalogExtension` to indicate the delegation.

On Mon, Sep 23, 2024 at 11:19 AM Manu Zhang <owenzhang1...@gmail.com> wrote:
> Hi Iceberg and Spark community,
>
> I'd like to bring your attention to a recent change[1] in Spark 3.5.3 that
> effectively breaks Iceberg's SparkSessionCatalog[2] and blocks Iceberg
> from upgrading to Spark 3.5.3[3].
>
> SparkSessionCatalog, as a customized Spark v2 session catalog, supports
> creating a v1 table with a v1 command. That's no longer allowed after the
> change unless the catalog extends DelegatingCatalogExtension. This is not
> minor work, since SparkSessionCatalog already extends a base class[4].
>
> To resolve this issue, we have to change public interfaces on either the
> Spark or the Iceberg side. IMHO, it doesn't make sense for a downstream
> project to refactor its interfaces when bumping up a maintenance version
> of Spark. WDYT?
>
> 1. https://github.com/apache/spark/pull/47724
> 2. https://iceberg.apache.org/docs/nightly/spark-configuration/#replacing-the-session-catalog
> 3. https://github.com/apache/iceberg/pull/11160
> 4. https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java
>
> Thanks,
> Manu
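
For reference, the Iceberg docs linked above [2] replace the session catalog with this configuration (spark-defaults.conf style; the `type = hive` line assumes a Hive metastore backing the session catalog):

    spark.sql.catalog.spark_catalog = org.apache.iceberg.spark.SparkSessionCatalog
    spark.sql.catalog.spark_catalog.type = hive

This is exactly the setup affected by the change: SparkSessionCatalog takes over `spark_catalog` without extending DelegatingCatalogExtension, so Spark 3.5.3 no longer routes v1 table DDL through it.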