Re: Iceberg 1.10.0 release update - August 2025

Steven Wu Sun, 31 Aug 2025 12:48:40 -0700

I started a vote thread for 1.10.0 RC2.

I had to fix a couple of release script issues, hence the first release
candidate up for a vote is RC2.


On Fri, Aug 29, 2025 at 9:53 AM Kevin Liu <[email protected]> wrote:

> Thanks Steven! I did another pass to check for feature parity between
> Spark 3.5 and Spark 4.0 for this release and everything looks good. There
> are a few test cases that have not been ported, but we can punt those for
> now.
>
> Best,
> Kevin Liu
>
> On Thu, Aug 28, 2025 at 7:08 PM Steven Wu <[email protected]> wrote:
>
>> Thanks to Fokko and Ryan, the unknown type support PR was merged today.
>>
>> Everything in the 1.10.0 milestone is closed now.
>>
>> I will work on a release candidate next.
>>
>> On Fri, Aug 8, 2025 at 6:14 AM Fokko Driesprong <[email protected]> wrote:
>>
>>> Hi Steven,
>>>
>>> Thanks for updating this thread.
>>>
>>> I've updated the UnknownType PR
>>> <https://github.com/apache/iceberg/pull/13445> to first block on the
>>> complex cases that will require some more discussion. This way we can
>>> revisit this also after the 1.10.0 release.
>>>
>>> Kind regards,
>>> Fokko
>>>
>>>
>>>
>>>
>>> On Thu, Aug 7, 2025 at 23:56, Steven Wu <[email protected]> wrote:
>>>
>>>> Edited the subject line as we are now in August.
>>>>
>>>> We are still waiting for the following two changes for the 1.10.0
>>>> release:
>>>> * Anton's fix for the data frame join using the same snapshot, which
>>>> will introduce a slight behavior change in Spark 4.0.
>>>> * Unknown type support.
>>>>
>>>>
>>>> On Fri, Aug 1, 2025 at 6:56 AM Alexandre Dutra <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Steven,
>>>>>
>>>>> A small regression with S3 signing has been reported to me. The fix is
>>>>> simple:
>>>>>
>>>>> https://github.com/apache/iceberg/pull/13718
>>>>>
>>>>> Would it still be possible to have it in 1.10, please?
>>>>>
>>>>> Thanks,
>>>>> Alex
>>>>>
>>>>>
>>>>> On Thu, Jul 31, 2025 at 7:19 PM Steven Wu <[email protected]>
>>>>> wrote:
>>>>> >
>>>>> > Currently, the 1.10.0 milestone has no open PRs
>>>>> > https://github.com/apache/iceberg/milestone/54
>>>>> >
>>>>> > The variant PRs were merged this week and last week. There are still some
>>>>> variant-testing-related PRs, which are probably not blockers for the 1.10.0
>>>>> release.
>>>>> > * Spark variant read: https://github.com/apache/iceberg/pull/13219
>>>>> > * use short strings: https://github.com/apache/iceberg/pull/13284
>>>>> >
>>>>> > We are still waiting for the following two changes
>>>>> > * Anton's fix for the data frame join using the same snapshot, which
>>>>> will introduce a slight behavior change in Spark 4.0.
>>>>> > * unknown type support. Fokko raised a discussion thread on a
>>>>> blocking issue.
>>>>> >
>>>>> > Did I miss anything else?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <[email protected]>
>>>>> wrote:
>>>>> >>
>>>>> >> Hey all,
>>>>> >>
>>>>> >> The read path for the UnknownType needs some community discussion.
>>>>> I've raised a separate thread. PTAL
>>>>> >>
>>>>> >> Kind regards from Belgium,
>>>>> >> Fokko
>>>>> >>
>>>>> >> On Sat, Jul 26, 2025 at 00:58, Ryan Blue <[email protected]> wrote:
>>>>> >>>
>>>>> >>> I thought that we said we wanted to get support out for v3
>>>>> features in this release unless there is some reasonable blocker, like
>>>>> Spark not having geospatial types. To me, I think that means we should aim
>>>>> to get variant and unknown done so that we have a complete implementation
>>>>> with a major engine. And it should not be particularly difficult to get
>>>>> unknown done so I'd opt to get it in.
>>>>> >>>
>>>>> >>> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <[email protected]>
>>>>> wrote:
>>>>> >>>>
>>>>> >>>> > I believe we also wanted to get in at least the read path for
>>>>> UnknownType. Fokko has a WIP PR for that.
>>>>> >>>> I thought the consensus in the community sync was that this is not
>>>>> a blocker, because it is a new feature implementation. If it is ready, it
>>>>> will be included.
>>>>> >>>>
>>>>> >>>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <[email protected]>
>>>>> wrote:
>>>>> >>>>>
>>>>> >>>>> I think Fokko's OOO. Should we help with that PR?
>>>>> >>>>>
>>>>> >>>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <
>>>>> [email protected]> wrote:
>>>>> >>>>>>
>>>>> >>>>>> I believe we also wanted to get in at least the read path for
>>>>> UnknownType. Fokko has a WIP PR for that.
>>>>> >>>>>>
>>>>> >>>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <[email protected]>
>>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> 3. Spark: fix data frame join based on different versions of
>>>>> the same table that may lead to weird results. Anton is working on a fix.
>>>>> It requires a small behavior change (table state may be stale up to
>>>>> refresh interval). Hence it is better to include it in the 1.10.0 release
>>>>> where Spark 4.0 is first supported.
>>>>> >>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is
>>>>> very close and will prioritize the review.
>>>>> >>>>>>>
>>>>> >>>>>>> We still have the above two issues pending. 3 doesn't have a
>>>>> PR yet. PR for 4 is not associated with the milestone yet.
>>>>> >>>>>>>
>>>>> >>>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>> Thanks everyone for the review. The 2 PRs are both merged.
>>>>> >>>>>>>> Looks like there's only 1 PR left in the 1.10 milestone :)
>>>>> >>>>>>>>
>>>>> >>>>>>>> Best,
>>>>> >>>>>>>> Kevin Liu
>>>>> >>>>>>>>
>>>>> >>>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Thanks Kevin. The first change is not in the versioned doc
>>>>> so it can be released anytime.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Regards,
>>>>> >>>>>>>>> Manu
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> The 3 PRs above are merged. Thanks everyone for the review.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are both
>>>>> nice-to-haves.
>>>>> >>>>>>>>>> - docs: add subpage for REST Catalog Spec in
>>>>> "Specification" #13521
>>>>> >>>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest
>>>>> fixture #13599
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> The first one changes the link for "REST Catalog Spec" on
>>>>> the left nav of https://iceberg.apache.org/spec/ from the swagger.io
>>>>> link to a dedicated page for IRC.
>>>>> >>>>>>>>>> The second one fixes the default behavior of
>>>>> `iceberg-rest-fixture` image to align with the general expectation when
>>>>> creating a table in a catalog.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Please take a look. I would like to have both of these as
>>>>> part of the 1.10 release.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Best,
>>>>> >>>>>>>>>> Kevin Liu
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> Here are the 3 PRs to add corresponding tests.
>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13648
>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13649
>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13650
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI
>>>>> to complete :)
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> Best,
>>>>> >>>>>>>>>>> Kevin Liu
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Kevin, thanks for checking that. I will take a look at
>>>>> your backport PRs. Can you add them to the 1.10.0 milestone?
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> Thanks again for driving this Steven! We're very close!!
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> As mentioned in the community sync today, I wanted to
>>>>> verify feature parity between Spark 3.5 and Spark 4.0 for this release.
>>>>> >>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have
>>>>> feature parity for this upcoming release. More details in the other
>>>>> devlist thread:
>>>>> https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> Thanks,
>>>>> >>>>>>>>>>>>> Kevin Liu
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Another update on the release.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> The existing blocker PRs are almost done.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> During today's community sync, we identified the
>>>>> following issues/PRs to be included in the 1.10.0 release.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> 1. Backport of PR 13100 to the main branch. I have created
>>>>> a cherry-pick PR for that. There is a one-line difference compared to the
>>>>> original PR due to the removal of the deprecated RemoveSnapshot class in
>>>>> main branch for 1.10.0 target. Amogh has suggested using RemoveSnapshots
>>>>> with a single snapshot id, which should be supported by all REST catalog
>>>>> servers.
>>>>> >>>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail the
>>>>> compaction for V3 tables. I created a PR for that. Will backport after it
>>>>> is merged.
>>>>> >>>>>>>>>>>>>> 3. Spark: fix data frame join based on different versions
>>>>> of the same table that may lead to weird results. Anton is working on a
>>>>> fix. It requires a small behavior change (table state may be stale up to
>>>>> refresh interval). Hence it is better to include it in the 1.10.0 release
>>>>> where Spark 4.0 is first supported.
>>>>> >>>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this
>>>>> is very close and will prioritize the review.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Thanks,
>>>>> >>>>>>>>>>>>>> steven
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> The 1.10.0 milestone can be found here.
>>>>> >>>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include the
>>>>> PR in the 1.10.0 milestone.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt
>>>>> <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent
>>>>> point of view, we will not be able to publish the connector on Confluent
>>>>> Hub until this CVE[1] is fixed.
>>>>> >>>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the
>>>>> fix doesn't make it into 1.10 then we'd have to wait for 1.11 (or a dot
>>>>> release of 1.10) to be able to include the connector on Confluent Hub.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> Thanks, Robin.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> [1]
>>>>> https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> I have approached Confluent people to help us
>>>>> publish the OSS Kafka Connect Iceberg sink plugin.
>>>>> >>>>>>>>>>>>>>>>> It seems we have a CVE from a dependency that blocks
>>>>> us from publishing the plugin.
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> Please include the PR below, which fixes that, in the
>>>>> 1.10.0 release.
>>>>> >>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> - Ajantha
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> > Engines may model operations as
>>>>> deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some
>>>>> context. The first half (as deleting/inserting rows) is probably about the
>>>>> row lineage handling with equality deletes, which is described in another
>>>>> place.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows
>>>>> updated via Equality Deletes, because engines using equality deletes avoid
>>>>> reading existing data before writing changes and can't provide the original
>>>>> row ID for the new rows. These updates are always treated as if the
>>>>> existing row was completely removed and a unique new row was added."
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part but the
>>>>> following sentence is a bit hard to understand (maybe just me):
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>> Engines may model operations as deleting/inserting
>>>>> rows or as modifications to rows that preserve row ids.
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>> Can you please help to explain?
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 04:41, Steven Wu <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> Manu
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry
>>>>> over (for replace)
>>>>> >>>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different
>>>>> data file for any reason, writers should write _row_id and
>>>>> _last_updated_sequence_number according to the following rules:"
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> Thanks,
>>>>> >>>>>>>>>>>>>>>>>>>> Steven
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> Another update on the release.
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0
>>>>> milestone (with 25 closed PRs). Amogh is actively working on the last
>>>>> blocker PR.
>>>>> >>>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on
>>>>> compaction
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the
>>>>> above blocker is merged and backported.
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>> >>>>>>>>>>>>>>>>>>>>> Steven
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <
>>>>> [email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi Amogh,
>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that "replace"
>>>>> operation should carry over existing lineage info instead of assigning
>>>>> new IDs? If not, we'd better define it in the spec first, because all engines
>>>>> and implementations need to follow it.
>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh
>>>>> Jahagirdar <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure
>>>>> works with row lineage before release is data file compaction. At the
>>>>> moment, it looks like compaction will read the records from the data files
>>>>> without projecting the lineage fields. What this means is that on write of
>>>>> the new compacted data files we'd be losing the lineage information.
>>>>> There's no data change in a compaction but we do need to make sure the
>>>>> lineage info from carried over records is materialized in the newly
>>>>> compacted files so they don't get new IDs or inherit the new file sequence
>>>>> number. I'm working on addressing this too, but I'd call it out as a
>>>>> blocker as well.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> --
>>>>> >>>>>>>>>>>>>>>> Robin Moffatt
>>>>> >>>>>>>>>>>>>>>> Sr. Principal Advisor, Streaming Data Technologies
>>>>>
>>>>
