I believe Aihua's approach is the most reasonable. For now, let's revert the core changes to avoid disrupting existing workflows. I've reviewed the PR and re-executed the failed tests.
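To make the partial revert concrete, here is a minimal sketch of the client-side idea as I understand it from Aihua's description: instead of one update carrying many snapshot IDs, the client emits one update per snapshot ID, which older servers already accept. The helper name `split_remove_snapshots` is hypothetical, and the `remove-snapshots` / `snapshot-ids` field names follow my reading of the REST spec. This is an illustration only, not the code in the PR.

```python
from typing import Dict, Iterable, List


def split_remove_snapshots(snapshot_ids: Iterable[int]) -> List[Dict]:
    """Build one remove-snapshots update per snapshot ID.

    Hypothetical helper: an older server that only accepts a batch
    size of 1 can process each of these updates individually.
    """
    return [
        {"action": "remove-snapshots", "snapshot-ids": [snapshot_id]}
        for snapshot_id in snapshot_ids
    ]


# Three snapshots to expire become three single-ID updates.
updates = split_remove_snapshots([101, 102, 103])
```

A client that later switches back to bulk removal would send a single update with all IDs in `snapshot-ids`, which is the behavior the older servers reject.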
If we're aligned on this direction, we should merge it into the 1.9.x branch and proceed with releasing version 1.9.1.

In the longer term, do we have a policy that defines when clients are expected to deploy new versions of the REST catalog? If a feature is introduced as part of new functionality, it seems reasonable to assume that clients would also be using the updated version of the catalog. However, when migrating existing functionality, we need to ensure backward compatibility to avoid breaking current workflows, as we've seen in this case.

Perhaps we should formalize Amogh's suggestion: that the client should be on the same version or one version behind the REST catalog it connects to. This would establish a clear compatibility boundary and help prevent issues like the one we just encountered.

On Tue, May 20, 2025, 00:01 Aihua Xu <aihu...@gmail.com> wrote:

> I think it makes sense to do a "partial revert" so the core for the client
> will produce a single snapshot that the existing server understands. We can
> upgrade the client in 1.10 to produce bulk snapshots. Please help check out
> the PR <https://github.com/apache/iceberg/pull/13100>.
>
> Thanks,
> Aihua
>
> On Mon, May 19, 2025 at 11:56 AM Amogh Jahagirdar <2am...@gmail.com>
> wrote:
>
>> I'm +0 on a full revert in 1.9 since this really just has to do with
>> client/server implementation behavior guarantees.
>> What I think Ryan was suggesting was that if we could first enable the
>> ability for servers to handle multiple snapshots and then on a subsequent
>> release, clients could then produce multiple snapshots it becomes less
>> abrupt of a behavior change. In this model there's a tolerance for the
>> server being one minor version behind clients which I think is reasonable,
>> certainly better than an abrupt change in behavior.
>>
>> In case folks agree, I'd say if we could produce a "partial revert"
>> specifically for 1.9 for the client producing updates with multiple
>> snapshots, that seems a lot more targeted and can get the benefits of the
>> change in the 1.9 release.
>>
>> If we think it's simpler to just revert for 1.9 and cycle these proposed
>> server changes for 1.10 and then the client changes for the release after
>> 1.10, I think I'm OK (+0) with that as well.
>>
>> Thanks,
>> Amogh J
>>
>> On Mon, May 19, 2025 at 12:12 PM Aihua Xu <aihu...@gmail.com> wrote:
>>
>>> Yeah. It's a change in 1.9.0 that was not caught in release. Seems
>>> revert is pretty straightforward and I just submitted the PR
>>> <https://github.com/apache/iceberg/pull/13098> if we are OK to revert
>>> in 1.9.1.
>>>
>>> On Mon, May 19, 2025 at 10:26 AM Russell Spitzer
>>> <russell.spit...@gmail.com> wrote:
>>>
>>>> As a heads up, this change
>>>> <https://github.com/apache/iceberg/commit/06f667ada5a5b9edeaa20ae9269ff5de1721b91d>
>>>> is already present in 1.9.0. We could hold off on 1.9.1 until we have a
>>>> change that reverts the behavior in 1.9.0. I think that would be fine as
>>>> long as we have a volunteer to work on it, I would be interested in just
>>>> releasing 1.9.1 and then doing a 1.9.2 unless we are sure the fix/revert
>>>> would be quick.
>>>>
>>>> On Mon, May 19, 2025 at 12:14 PM Ryan Blue <rdb...@gmail.com> wrote:
>>>>
>>>>> I think we should address the problem that Aihua pointed out. Even if
>>>>> we can technically say that we are following the spec, this is a behavior
>>>>> change that is known to break with existing REST catalog services. I don't
>>>>> think that we should release a version that is known to break with existing
>>>>> services that were based on the previous Iceberg version.
>>>>>
>>>>> I suggest that we implement a fix to handle multiple snapshot IDs for
>>>>> this release so that services can upgrade to 1.9 and then update clients in
>>>>> the next release.
>>>>>
>>>>> On Mon, May 19, 2025 at 10:03 AM Amogh Jahagirdar <2am...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks Aihua and Ajantha who pointed this out,
>>>>>>
>>>>>> If I understand the issue correctly, I don't think I consider it as
>>>>>> an incompatible change. The REST protocol always allowed for clients
>>>>>> to remove snapshots in bulk
>>>>>> <https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L2858>,
>>>>>> it's just that we had a limitation in the reference implementation that the
>>>>>> batch size is 1. I'm guessing the failure that's being seen on the server
>>>>>> side is the assertion that the bulk size is 1 which is no longer the case
>>>>>> from newer clients?
>>>>>>
>>>>>> So in this case, newer clients are trying to express deletions with
>>>>>> larger sizes and the server is unable to handle it due to the assertion in
>>>>>> the older implementation, not because the protocol changed. Though I can
>>>>>> see the grey area in that it either forces clients to not upgrade for Java
>>>>>> server implementations which haven't upgraded OR it server implementations
>>>>>> end up upgrading, but this still feels implementation specific and not tied
>>>>>> to the protocol compatibility.
>>>>>>
>>>>>> On Mon, May 19, 2025 at 10:29 AM Aihua Xu <aihu...@gmail.com> wrote:
>>>>>>
>>>>>>> I have verified RC against Snowflake build. Everything works except
>>>>>>> one issue introduced by
>>>>>>> https://github.com/apache/iceberg/pull/12670/ : the client with
>>>>>>> 1.9.x can't work with the catalog server with old library to remove the
>>>>>>> snapshots since the client now will remove the snapshots in bulk while
>>>>>>> the old server doesn't support.
>>>>>>> Let me know if it's considered an incompatible change. Otherwise, it
>>>>>>> looks good to me.
>>>>>>>
>>>>>>> On Mon, May 19, 2025 at 4:58 AM Péter Váry
>>>>>>> <peter.vary.apa...@gmail.com> wrote:
>>>>>>>
>>>>>>>> +1 (binding)
>>>>>>>> Verified signature, built, and run some tests
>>>>>>>>
>>>>>>>> On Mon, May 19, 2025 at 11:17, Maximilian Michels <m...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> +1 (non-binding)
>>>>>>>>>
>>>>>>>>> 1. Verified the archive checksum and signature
>>>>>>>>> 2. Extracted and inspected the source code for binaries
>>>>>>>>> 3. Compiled and tested the source code
>>>>>>>>> 4. Verified license files / headers
>>>>>>>>>
>>>>>>>>> -Max
>>>>>>>>>
>>>>>>>>> On Mon, May 19, 2025 at 6:52 AM Daniel Weeks <dwe...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>> >
>>>>>>>>> > +1 (binding)
>>>>>>>>> >
>>>>>>>>> > Verified sigs/sums/license/build/test
>>>>>>>>> >
>>>>>>>>> > Checked that the iceberg build version is correctly represented.
>>>>>>>>> >
>>>>>>>>> > Ran into the hadoop commit test timeouts, but succeeded on
>>>>>>>>> > re-attempt (I believe we have fixes upstream for this).
>>>>>>>>> >
>>>>>>>>> > -Dan
>>>>>>>>> >
>>>>>>>>> > On Sun, May 18, 2025 at 5:20 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>> >>
>>>>>>>>> >> +1 (binding)
>>>>>>>>> >>
>>>>>>>>> >> Checked signature, checksum, and licenses.
>>>>>>>>> >>
>>>>>>>>> >> Also ran Flink 1.20 with SQL.
>>>>>>>>> >>
>>>>>>>>> >> Thanks Russell for driving the release!
>>>>>>>>> >>
>>>>>>>>> >> On Sun, May 18, 2025 at 2:27 PM huaxin gao <huaxin.ga...@gmail.com> wrote:
>>>>>>>>> >>>
>>>>>>>>> >>> +1 (non-binding)
>>>>>>>>> >>> Verified signature, checksum and license. Thanks Russell for
>>>>>>>>> >>> driving this release!
>>>>>>>>> >>>
>>>>>>>>> >>> Huaxin
>>>>>>>>> >>>
>>>>>>>>> >>> On Sun, May 18, 2025 at 2:03 PM Fokko Driesprong <fo...@apache.org> wrote:
>>>>>>>>> >>>>
>>>>>>>>> >>>> +1 (binding)
>>>>>>>>> >>>>
>>>>>>>>> >>>> Checked signature, checksum, and licenses.
>>>>>>>>> >>>>
>>>>>>>>> >>>> Thanks Russell, for running this release!
>>>>>>>>> >>>>
>>>>>>>>> >>>> Kind regards,
>>>>>>>>> >>>> Fokko
>>>>>>>>> >>>>
>>>>>>>>> >>>> On Sun, May 18, 2025 at 01:05, Yuya Ebihara
>>>>>>>>> >>>> <yuya.ebih...@starburstdata.com> wrote:
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> +1 (non-binding)
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> Confirmed that Trino and Starburst CI are green.
>>>>>>>>> >>>>> It runs tests against several catalogs, including HMS, Glue,
>>>>>>>>> >>>>> JDBC (PostgreSQL), REST (Polaris, Unity, S3 Tables, Tabular),
>>>>>>>>> >>>>> Nessie, and Snowflake.
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> BR,
>>>>>>>>> >>>>> Yuya
>>>>>>>>> >>>>>
>>>>>>>>> >>>>> On Sun, May 18, 2025 at 2:13 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> +1 (non-binding)
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> * Verified signature, checksum, license.
>>>>>>>>> >>>>>> * Build + test passed using Java 17 on M1
>>>>>>>>> >>>>>> * Ran a few examples on Spark
>>>>>>>>> >>>>>> * Ran pyiceberg integration tests
>>>>>>>>> >>>>>>   (https://github.com/apache/iceberg-python/pull/2011)
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> Best,
>>>>>>>>> >>>>>> Kevin Liu
>>>>>>>>> >>>>>>
>>>>>>>>> >>>>>> On Sat, May 17, 2025 at 10:02 AM Jean-Baptiste Onofré
>>>>>>>>> >>>>>> <j...@nanthrax.net> wrote:
>>>>>>>>> >>>>>>>
>>>>>>>>> >>>>>>> Sorry I meant +1 (non binding)
>>>>>>>>> >>>>>>>
>>>>>>>>> >>>>>>> On Sat, May 17, 2025 at 08:10, Jean-Baptiste Onofré
>>>>>>>>> >>>>>>> <j...@nanthrax.net> wrote:
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> +0 (non binding)
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> - Signature and checksum are good
>>>>>>>>> >>>>>>>> - ASF header present in expected file
>>>>>>>>> >>>>>>>> - No binary found in the source distribution
>>>>>>>>> >>>>>>>> - Build is OK
>>>>>>>>> >>>>>>>> - Tested with spark and flink, need some update on Polaris
>>>>>>>>> >>>>>>>> - The aws-bundle, azure-bundle, gcp-bundle, kafka-connect-runtime
>>>>>>>>> >>>>>>>> LICENSE should include content for MIT and BSD (inline or dedicated
>>>>>>>>> >>>>>>>> folder), also, in case of dual license, we should "exclusively" select
>>>>>>>>> >>>>>>>> one. I'm going to fix that; as it has been like this for a while (I
>>>>>>>>> >>>>>>>> missed that before), it can be fixed in the next release.
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> Regards
>>>>>>>>> >>>>>>>> JB
>>>>>>>>> >>>>>>>>
>>>>>>>>> >>>>>>>> On Fri, May 16, 2025 at 11:32 PM Russell Spitzer
>>>>>>>>> >>>>>>>> <russell.spit...@gmail.com> wrote:
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > Hi Y'all,
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > I propose that we release the following RC as the
>>>>>>>>> >>>>>>>> > official Apache Iceberg 1.9.1 release.
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > The commit ID is 5541cf000084b9e139d8dd22db44db7f592c3a2d
>>>>>>>>> >>>>>>>> > * This corresponds to the tag: apache-iceberg-1.9.1-rc0
>>>>>>>>> >>>>>>>> > * https://github.com/apache/iceberg/commits/apache-iceberg-1.9.1-rc0
>>>>>>>>> >>>>>>>> > * https://github.com/apache/iceberg/tree/5541cf000084b9e139d8dd22db44db7f592c3a2d
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > The release tarball, signature, and checksums are here:
>>>>>>>>> >>>>>>>> > * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.9.1-rc0
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > You can find the KEYS file here:
>>>>>>>>> >>>>>>>> > * https://downloads.apache.org/iceberg/KEYS
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > Convenience binary artifacts are staged on Nexus. The
>>>>>>>>> >>>>>>>> > Maven repository URL is:
>>>>>>>>> >>>>>>>> > * https://repository.apache.org/content/repositories/orgapacheiceberg-1201/
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > Please download, verify, and test.
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > Please vote in the next 72 hours.
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > [ ] +1 Release this as Apache Iceberg 1.9.1
>>>>>>>>> >>>>>>>> > [ ] +0
>>>>>>>>> >>>>>>>> > [ ] -1 Do not release this because...
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>> > Only PMC members have binding votes, but other
>>>>>>>>> >>>>>>>> > community members are encouraged to cast
>>>>>>>>> >>>>>>>> > non-binding votes. This vote will pass if there are 3
>>>>>>>>> >>>>>>>> > binding +1 votes and more binding
>>>>>>>>> >>>>>>>> > +1 votes than -1 votes.
>>>>>>>>> >>>>>>>> >
>>>>>>>>> >>>>>>>>