Hi Russell,

Thanks for testing it out! It's a bit unfortunate that we found this issue
after the RC stage. I've made a fix for it:
https://github.com/apache/spark/pull/48257 . I think it should work, but
let's confirm it. After it gets merged, we can probably wait a while to let
more fixes land in branch-3.5 and then make another release.

On Thu, Sep 26, 2024 at 5:59 AM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> I checked, and extending DelegatingCatalogExtension will be quite difficult,
> or at least will break several behaviors of the current Iceberg
> SparkSessionCatalog implementation. Note this has nothing to do with
> third-party catalogs; it is more directly about how Iceberg works with Spark
> regardless of the catalog implementation.
>
> Main issues on the Iceberg side:
>
> 1. initialize() is final and empty in DelegatingCatalogExtension. This means
> we have no way of taking custom catalog configuration and applying it to the
> Iceberg plugin. Currently this is used for a few things: choosing the
> underlying Iceberg catalog implementation, catalog cache settings, and the
> Iceberg environment context.
>
> 2. No access to the delegate catalog object. The delegate is private, so we
> are unable to touch it in our extended class; it is currently used for
> Iceberg's "staged create" and "staged replace" functions. We could work
> around this by disabling staged create and replace when a delegate is being
> used, but that would break existing Iceberg behavior.
>
> Outside of these two issues I was able to get everything else working as
> expected, but I think both of them are probably blockers.
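[Editor's note: a minimal, hypothetical Java sketch of the two blockers described above. These are simplified stand-ins, not the real Spark classes (the actual ones live in org.apache.spark.sql.connector.catalog); the class and method names here are illustrative only.]

```java
import java.util.Map;

// Stand-in for Spark's CatalogPlugin interface (hypothetical simplification).
interface CatalogPlugin {
    void initialize(String name, Map<String, String> options);
}

// Stand-in for DelegatingCatalogExtension, mimicking the two problems:
abstract class DelegatingExtensionSketch implements CatalogPlugin {
    // Issue 2: the delegate is private with no protected accessor, so a
    // subclass cannot reach it for staged create / staged replace.
    private CatalogPlugin delegate;

    final void setDelegateCatalog(CatalogPlugin delegate) {
        this.delegate = delegate;
    }

    // Issue 1: final and empty -- subclasses never observe `options`, so
    // settings like the underlying catalog impl, cache settings, and
    // environment context are silently dropped.
    @Override
    public final void initialize(String name, Map<String, String> options) {
    }
}

// Stand-in for an Iceberg-style session catalog built on the extension.
class SessionCatalogSketch extends DelegatingExtensionSketch {
    String catalogImpl; // would normally be read from `options`

    // Cannot override initialize() to capture options; any custom
    // configuration passed by Spark is lost.
}
```

Under this shape, any configuration passed at initialization never reaches the subclass, which is exactly the first blocker.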
>
>
> On Wed, Sep 25, 2024 at 3:51 PM Russell Spitzer <russell.spit...@gmail.com>
> wrote:
>
>> I think it should be minimally difficult to switch this around on the
>> Iceberg side; we only have to move the initialize code out and duplicate
>> it. Not a huge cost.
>>
>> On Sun, Sep 22, 2024 at 11:39 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> It's buggy behavior for a custom v2 catalog (one that does not extend
>>> DelegatingCatalogExtension) to expect Spark to still use the v1 DDL
>>> commands to operate on the tables inside it. This is also why third-party
>>> catalogs (e.g. Unity Catalog and Apache Polaris) cannot be used to
>>> override `spark_catalog` if people still want to use the Spark built-in
>>> file sources.
>>>
>>> Technically, I think it's wrong for a third-party catalog to rely on
>>> Spark's session catalog without extending `DelegatingCatalogExtension`, as
>>> that confuses Spark. If the catalog has its own metastore, then it
>>> shouldn't delegate requests to the Spark session catalog or use v1 DDL
>>> commands, which only work with the Spark session catalog. Otherwise, it
>>> should extend `DelegatingCatalogExtension` to indicate that it delegates.
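[Editor's note: the routing rule described above can be sketched with hypothetical stand-in types. These are not the real Spark interfaces, and `ddlPathFor` is an invented illustration, not an actual Spark method; it only models the reported behavior that after the change, the v1 DDL path is kept for a `spark_catalog` replacement only when it extends the delegating extension.]

```java
// Hypothetical stand-ins for Spark's catalog interfaces.
interface TableCatalog { }
abstract class CatalogExtensionSketch implements TableCatalog { }

final class CommandRouter {
    // Simplified model of the routing described above: v1 DDL commands are
    // used for the session catalog only if its replacement extends the
    // delegating extension; any other custom v2 catalog gets v2 commands.
    static String ddlPathFor(TableCatalog sparkCatalogReplacement) {
        return (sparkCatalogReplacement instanceof CatalogExtensionSketch)
            ? "v1" : "v2";
    }
}

// A catalog with its own metastore implements the v2 interface directly...
class OwnMetastoreCatalog implements TableCatalog { }
// ...while one layering on the session catalog extends the extension class.
class LayeredSessionCatalog extends CatalogExtensionSketch { }
```

In this model, an Iceberg-style catalog that layers on the session catalog but does not extend the extension class ends up on the v2 path, which is the breakage reported in this thread.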
>>>
>>> On Mon, Sep 23, 2024 at 11:19 AM Manu Zhang <owenzhang1...@gmail.com>
>>> wrote:
>>>
>>>> Hi Iceberg and Spark community,
>>>>
>>>> I'd like to bring your attention to a recent change[1] in Spark 3.5.3
>>>> that effectively breaks Iceberg's SparkSessionCatalog[2] and blocks
>>>> Iceberg from upgrading to Spark 3.5.3[3].
>>>>
>>>> SparkSessionCatalog, as a customized Spark V2 session catalog, supports
>>>> creating a V1 table with a V1 command. That is no longer allowed after
>>>> the change unless the catalog extends DelegatingCatalogExtension. This is
>>>> not minor work, since SparkSessionCatalog already extends a base class[4].
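[Editor's note: a tiny hypothetical sketch of why "just extend DelegatingCatalogExtension" is not minor work. The class names are invented stand-ins; the point is only Java's single-inheritance rule.]

```java
// Stand-in for the base class SparkSessionCatalog already extends.
class BaseCatalogSketch { }
// Stand-in for DelegatingCatalogExtension (unused here, by design: the
// catalog below cannot also extend it).
abstract class ExtensionBaseSketch { }

class IcebergSessionCatalogSketch extends BaseCatalogSketch {
    // "extends BaseCatalogSketch, ExtensionBaseSketch" would not compile:
    // a Java class can extend only one class. Switching the parent to
    // ExtensionBaseSketch means duplicating the base class's logic instead.
}
```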
>>>>
>>>> To resolve this issue, we have to make changes to public interfaces on
>>>> either the Spark or the Iceberg side. IMHO, it doesn't make sense for a
>>>> downstream project to refactor its interfaces when bumping up a
>>>> maintenance version of Spark. WDYT?
>>>>
>>>>
>>>> 1. https://github.com/apache/spark/pull/47724
>>>> 2.
>>>> https://iceberg.apache.org/docs/nightly/spark-configuration/#replacing-the-session-catalog
>>>> 3. https://github.com/apache/iceberg/pull/11160
>>>> 4.
>>>> https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java
>>>>
>>>> Thanks,
>>>> Manu
>>>>
>>>>
