Hi,

Just to update the community on the status.

Fokko also reached out to include Parquet Java 1.16.0 in this release. The vote just passed in the Parquet community, and we are waiting for the binary release. We will try to include it in the 1.10.0 release. Reviews are welcome.
https://github.com/apache/iceberg/pull/1394

We also ran into a couple of issues with the release script/process.

1) staging-binaries.sh has race conditions on concurrent publish and 2 folders in the Maven repo. I thought this PR would fix the issue. Initially, it worked well for a few runs, but later I hit the same problem again. Suggestions are appreciated!
https://github.com/apache/iceberg/pull/13958

2) Yuya found out that the iceberg-api module wasn't published in the RC2 staging (1243).
https://repository.apache.org/content/repositories/orgapacheiceberg-1243/

The first release issue is the more annoying/impactful problem. The second release issue is uncommon, as I didn't see it in a few other runs of staging-binaries.sh.

Thanks,
Steven

On Sun, Aug 31, 2025 at 12:48 PM Steven Wu <stevenz...@gmail.com> wrote:

> I started a vote thread for 1.10.0 RC2.
>
> I had to fix a couple of release script issues. Hence the first release candidate to vote on is RC2.
>
> On Fri, Aug 29, 2025 at 9:53 AM Kevin Liu <kevinjq...@apache.org> wrote:
>
>> Thanks Steven! I did another pass to check for feature parity between Spark 3.5 and Spark 4.0 for this release and everything looks good. There are a few test cases that have not been ported, but we can punt those for now.
>>
>> Best,
>> Kevin Liu
>>
>> On Thu, Aug 28, 2025 at 7:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> Thanks to Fokko and Ryan, the unknown type support PR was merged today.
>>>
>>> Everything in the 1.10.0 milestone is closed now.
>>>
>>> I will work on a release candidate next.
>>>
>>> On Fri, Aug 8, 2025 at 6:14 AM Fokko Driesprong <fo...@apache.org> wrote:
>>>
>>>> Hi Steven,
>>>>
>>>> Thanks for updating this thread.
>>>>
>>>> I've updated the UnknownType PR <https://github.com/apache/iceberg/pull/13445> to first block on the complex cases that will require some more discussion. This way we can revisit this also after the 1.10.0 release.
>>>>
>>>> Kind regards,
>>>> Fokko
>>>>
>>>> On Thu, Aug 7, 2025 at 11:56 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>
>>>>> Edited the subject line as we are into August.
>>>>>
>>>>> We are still waiting for the following two changes for the 1.10.0 release:
>>>>> * Anton's fix for the data frame join using the same snapshot, which will introduce a slight behavior change in Spark 4.0.
>>>>> * unknown type support.
>>>>>
>>>>> On Fri, Aug 1, 2025 at 6:56 AM Alexandre Dutra <adu...@apache.org> wrote:
>>>>>
>>>>>> Hi Steven,
>>>>>>
>>>>>> A small regression with S3 signing has been reported to me. The fix is simple:
>>>>>>
>>>>>> https://github.com/apache/iceberg/pull/13718
>>>>>>
>>>>>> Would it still be possible to have it in 1.10 please?
>>>>>>
>>>>>> Thanks,
>>>>>> Alex
>>>>>>
>>>>>> On Thu, Jul 31, 2025 at 7:19 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >
>>>>>> > Currently, the 1.10.0 milestone has no open PRs
>>>>>> > https://github.com/apache/iceberg/milestone/54
>>>>>> >
>>>>>> > The variant PRs were merged this week and last week. There are still some variant testing related PRs, which are probably not blockers for the 1.10.0 release.
>>>>>> > * Spark variant read: https://github.com/apache/iceberg/pull/13219
>>>>>> > * use short strings: https://github.com/apache/iceberg/pull/13284
>>>>>> >
>>>>>> > We are still waiting for the following two changes
>>>>>> > * Anton's fix for the data frame join using the same snapshot, which will introduce a slight behavior change in Spark 4.0.
>>>>>> > * unknown type support. Fokko raised a discussion thread on a blocking issue.
>>>>>> >
>>>>>> > Did I miss anything else?
>>>>>> >
>>>>>> > On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <fo...@apache.org> wrote:
>>>>>> >>
>>>>>> >> Hey all,
>>>>>> >>
>>>>>> >> The read path for the UnknownType needs some community discussion. I've raised a separate thread. PTAL
>>>>>> >>
>>>>>> >> Kind regards from Belgium,
>>>>>> >> Fokko
>>>>>> >>
>>>>>> >> On Sat, Jul 26, 2025 at 12:58 AM Ryan Blue <rdb...@gmail.com> wrote:
>>>>>> >>>
>>>>>> >>> I thought that we said we wanted to get support out for v3 features in this release unless there is some reasonable blocker, like Spark not having geospatial types. To me, I think that means we should aim to get variant and unknown done so that we have a complete implementation with a major engine. And it should not be particularly difficult to get unknown done so I'd opt to get it in.
>>>>>> >>>
>>>>>> >>> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>
>>>>>> >>>> > I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR for that.
>>>>>> >>>> I thought in the community sync the consensus was that this is not a blocker, because it is a new feature implementation. If it is ready, it will be included.
>>>>>> >>>>
>>>>>> >>>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>> >>>>>
>>>>>> >>>>> I think Fokko's OOO. Should we help with that PR?
>>>>>> >>>>>
>>>>>> >>>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <etudenhoef...@apache.org> wrote:
>>>>>> >>>>>>
>>>>>> >>>>>> I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR for that.
>>>>>> >>>>>>
>>>>>> >>>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> 3. Spark: fix data frame join based on different versions of the same table that may lead to weird results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to refresh interval). Hence it is better to include it in the 1.10.0 release where Spark 4.0 is first supported.
>>>>>> >>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.
>>>>>> >>>>>>>
>>>>>> >>>>>>> We still have the above two issues pending. 3 doesn't have a PR yet. PR for 4 is not associated with the milestone yet.
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Thanks everyone for the review. The 2 PRs are both merged.
>>>>>> >>>>>>>> Looks like there's only 1 PR left in the 1.10 milestone :)
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> Best,
>>>>>> >>>>>>>> Kevin Liu
>>>>>> >>>>>>>>
>>>>>> >>>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> Thanks Kevin. The first change is not in the versioned doc so it can be released anytime.
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> Regards,
>>>>>> >>>>>>>>> Manu
>>>>>> >>>>>>>>>
>>>>>> >>>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> The 3 PRs above are merged. Thanks everyone for the review.
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are both nice-to-haves.
>>>>>> >>>>>>>>>> - docs: add subpage for REST Catalog Spec in "Specification" #13521
>>>>>> >>>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest fixture #13599
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> The first one changes the link for "REST Catalog Spec" on the left nav of https://iceberg.apache.org/spec/ from the swagger.io link to a dedicated page for IRC.
>>>>>> >>>>>>>>>> The second one fixes the default behavior of `iceberg-rest-fixture` image to align with the general expectation when creating a table in a catalog.
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> Please take a look. I would like to have both of these as part of the 1.10 release.
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> Best,
>>>>>> >>>>>>>>>> Kevin Liu
>>>>>> >>>>>>>>>>
>>>>>> >>>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>> >>>>>>>>>>>
>>>>>> >>>>>>>>>>> Here are the 3 PRs to add corresponding tests.
>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13648
>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13649
>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13650
>>>>>> >>>>>>>>>>>
>>>>>> >>>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI to complete :)
>>>>>> >>>>>>>>>>>
>>>>>> >>>>>>>>>>> Best,
>>>>>> >>>>>>>>>>> Kevin Liu
>>>>>> >>>>>>>>>>>
>>>>>> >>>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>
>>>>>> >>>>>>>>>>>> Kevin, thanks for checking that. I will take a look at your backport PRs. Can you add them to the 1.10.0 milestone?
>>>>>> >>>>>>>>>>>>
>>>>>> >>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>> Thanks again for driving this Steven! We're very close!!
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>> As mentioned in the community sync today, I wanted to verify feature parity between Spark 3.5 and Spark 4.0 for this release.
>>>>>> >>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have feature parity for this upcoming release. More details in the other devlist thread https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>> Thanks,
>>>>>> >>>>>>>>>>>>> Kevin Liu
>>>>>> >>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> Another update on the release.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> The existing blocker PRs are almost done.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> During today's community sync, we identified the following issues/PRs to be included in the 1.10.0 release.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> 1. backport of PR 13100 to the main branch. I have created a cherry-pick PR for that. There is a one line difference compared to the original PR due to the removal of the deprecated RemoveSnapshot class in main branch for 1.10.0 target. Amogh has suggested using RemoveSnapshots with a single snapshot id, which should be supported by all REST catalog servers.
>>>>>> >>>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail the compaction for V3 tables. I created a PR for that. Will backport after it is merged.
>>>>>> >>>>>>>>>>>>>> 3. Spark: fix data frame join based on different versions of the same table that may lead to weird results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to refresh interval). Hence it is better to include it in the 1.10.0 release where Spark 4.0 is first supported.
>>>>>> >>>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> Thanks,
>>>>>> >>>>>>>>>>>>>> steven
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> The 1.10.0 milestone can be found here.
>>>>>> >>>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54
>>>>>> >>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include the PR in the 1.10.0 milestone.
>>>>>> >>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt <ro...@confluent.io.invalid> wrote:
>>>>>> >>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent point of view, we will not be able to publish the connector on Confluent Hub until this CVE[1] is fixed.
>>>>>> >>>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the fix doesn't make it into 1.10 then we'd have to wait for 1.11 (or a dot release of 1.10) to be able to include the connector on Confluent Hub.
>>>>>> >>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>> Thanks, Robin.
>>>>>> >>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>> [1] https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>>>>> >>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <ajanthab...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> I have approached Confluent people to help us publish the OSS Kafka Connect Iceberg sink plugin.
>>>>>> >>>>>>>>>>>>>>>>> It seems we have a CVE from dependency that blocks us from publishing the plugin.
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> Please include the below PR for 1.10.0 release which fixes that.
>>>>>> >>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> - Ajantha
>>>>>> >>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>> > Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some context. The first half (as deleting/inserting rows) is probably about the row lineage handling with equality deletes, which is described in another place.
>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows updated via Equality Deletes, because engines using equality deletes avoid reading existing data before writing changes and can't provide the original row ID for the new rows. These updates are always treated as if the existing row was completely removed and a unique new row was added."
>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part but the following sentence is a bit hard to understand (maybe just me)
>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>> Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>> Can you please help to explain?
>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 4:41 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>> Manu
>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry over (for replace)
>>>>>> >>>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different data file for any reason, writers should write _row_id and _last_updated_sequence_number according to the following rules:"
>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>> >>>>>>>>>>>>>>>>>>>> Steven
>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>> Another update on the release.
>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone (with 25 closed PRs). Amogh is actively working on the last blocker PR.
>>>>>> >>>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on compaction
>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the above blocker is merged and backported.
>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>> >>>>>>>>>>>>>>>>>>>>> Steven
>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi Amogh,
>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that the "replace" operation should carry over existing lineage info instead of assigning new IDs? If not, we'd better first define it in the spec because all engines and implementations need to follow it.
>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar <2am...@gmail.com> wrote:
>>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure works with row lineage before release is data file compaction. At the moment, it looks like compaction will read the records from the data files without projecting the lineage fields. What this means is that on write of the new compacted data files we'd be losing the lineage information. There's no data change in a compaction but we do need to make sure the lineage info from carried over records is materialized in the newly compacted files so they don't get new IDs or inherit the new file sequence number. I'm working on addressing this as well, but I'd call this out as a blocker as well.
>>>>>> >>>>>>>>>>>>>>>>
>>>>>> >>>>>>>>>>>>>>>> --
>>>>>> >>>>>>>>>>>>>>>> Robin Moffatt
>>>>>> >>>>>>>>>>>>>>>> Sr. Principal Advisor, Streaming Data Technologies
>>>>>> >>>>>