Hi Russell,

Thanks for testing it out! It's a bit unfortunate that we found this issue
after the RC stage. I've made a fix for it:
https://github.com/apache/spark/pull/48257 . I think it should work, but
let's confirm it. After it gets merged, we can probably wait a while for
more fixes to land in branch-3.5 and then make another release.
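For anyone following this thread without the code in front of them, the
constraint under discussion looks roughly like this. This is a simplified
sketch of the relevant parts of DelegatingCatalogExtension in Spark 3.5,
not the exact source:

    import org.apache.spark.sql.connector.catalog.CatalogExtension;
    import org.apache.spark.sql.connector.catalog.CatalogPlugin;
    import org.apache.spark.sql.util.CaseInsensitiveStringMap;

    public abstract class DelegatingCatalogExtension implements CatalogExtension {

      // The delegate is private, so a subclass cannot reach the underlying
      // session catalog (Russell's issue 2 below).
      private CatalogPlugin delegate;

      @Override
      public void setDelegateCatalog(CatalogPlugin delegate) {
        this.delegate = delegate;
      }

      // initialize() is final and empty, so a subclass gets no hook to read
      // the per-catalog options Spark passes in (Russell's issue 1 below).
      @Override
      public final void initialize(String name, CaseInsensitiveStringMap options) {}
    }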
On Thu, Sep 26, 2024 at 5:59 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

> Checked, and extending DelegatingCatalogExtension will be quite difficult,
> or at least will cause several breaks in current Iceberg
> SparkSessionCatalog implementations. Note this has nothing to do with
> third-party catalogs; it is about how Iceberg works with Spark regardless
> of the catalog implementation.
>
> Main issues on the Iceberg side:
>
> 1. initialize() is final and empty in DelegatingCatalogExtension. This
> means we have no way of taking custom catalog configuration and applying
> it to the Iceberg plugin. Currently this configuration is used for a few
> things: choosing the underlying Iceberg catalog implementation, catalog
> cache settings, and the Iceberg environment context.
>
> 2. No access to the delegate catalog object. The delegate is private, so
> we are unable to touch it in our extending class; it is currently used for
> Iceberg's "staged create" and "staged replace" functions. We could work
> around this by disabling staged create and replace when the delegate is
> being used, but that would be a break in Iceberg behavior.
>
> Outside of these two issues I was able to get everything else working as
> expected, but I think both of these are probably blockers.
>
> On Wed, Sep 25, 2024 at 3:51 PM Russell Spitzer <russell.spit...@gmail.com> wrote:
>
>> I think it should be minimally difficult to switch this around on the
>> Iceberg side; we only have to move the initialize code out and duplicate
>> it. Not a huge cost.
>>
>> On Sun, Sep 22, 2024 at 11:39 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> It's buggy behavior for a custom v2 catalog (one that does not extend
>>> DelegatingCatalogExtension) to expect Spark to still use the v1 DDL
>>> commands to operate on the tables inside it. This is also why
>>> third-party catalogs (e.g. Unity Catalog and Apache Polaris) cannot be
>>> used to override `spark_catalog` if people still want to use the Spark
>>> built-in file sources.
>>>
>>> Technically, I think it's wrong for a third-party catalog to rely on
>>> Spark's session catalog without extending `DelegatingCatalogExtension`,
>>> as it confuses Spark. If it has its own metastore, then it shouldn't
>>> delegate requests to the Spark session catalog and use v1 DDL commands,
>>> which only work with the Spark session catalog. Otherwise, it should
>>> extend `DelegatingCatalogExtension` to indicate that.
>>>
>>> On Mon, Sep 23, 2024 at 11:19 AM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>
>>>> Hi Iceberg and Spark community,
>>>>
>>>> I'd like to bring to your attention a recent change[1] in Spark 3.5.3
>>>> that effectively breaks Iceberg's SparkSessionCatalog[2] and blocks
>>>> Iceberg from upgrading to Spark 3.5.3[3].
>>>>
>>>> SparkSessionCatalog, as a customized Spark V2 session catalog, supports
>>>> creating a V1 table with a V1 command. That's no longer allowed after
>>>> the change unless the catalog extends DelegatingCatalogExtension, which
>>>> is not minor work since SparkSessionCatalog already extends a base
>>>> class[4].
>>>>
>>>> To resolve this issue, we have to change public interfaces on either
>>>> the Spark side or the Iceberg side. IMHO, it doesn't make sense for a
>>>> downstream project to refactor its public interfaces when bumping a
>>>> maintenance version of Spark. WDYT?
>>>>
>>>> 1. https://github.com/apache/spark/pull/47724
>>>> 2. https://iceberg.apache.org/docs/nightly/spark-configuration/#replacing-the-session-catalog
>>>> 3. https://github.com/apache/iceberg/pull/11160
>>>> 4. https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java
>>>>
>>>> Thanks,
>>>> Manu
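
(For context, the session catalog replacement Manu mentions is configured
along these lines, per the Iceberg docs he links as [2]; the hive catalog
type here is just the usual example from those docs:

    spark.sql.extensions = org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
    spark.sql.catalog.spark_catalog = org.apache.iceberg.spark.SparkSessionCatalog
    spark.sql.catalog.spark_catalog.type = hive

The per-catalog option on the last line is exactly the kind of custom
configuration that, per Russell's issue 1, a DelegatingCatalogExtension
subclass currently has no way to receive through initialize().)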