Thanks to Fokko and Ryan, the unknown type support PR was merged today.

Everything in the 1.10.0 milestone is closed now.

I will work on a release candidate next.

On Fri, Aug 8, 2025 at 6:14 AM Fokko Driesprong <fo...@apache.org> wrote:

> Hi Steven,
>
> Thanks for updating this thread.
>
> I've updated the UnknownType PR
> <https://github.com/apache/iceberg/pull/13445> to first block on the
> complex cases that will require more discussion. This way we can also
> revisit this after the 1.10.0 release.
>
> Kind regards,
> Fokko
>
>
>
>
> On Thu, Aug 7, 2025 at 23:56, Steven Wu <stevenz...@gmail.com> wrote:
>
>> Edited the subject line as we are now into August.
>>
>> We are still waiting for the following two changes for the 1.10.0 release
>> * Anton's fix for the data frame join using the same snapshot, which will
>> introduce a slight behavior change in Spark 4.0.
>> * unknown type support.
>>
>>
>> On Fri, Aug 1, 2025 at 6:56 AM Alexandre Dutra <adu...@apache.org> wrote:
>>
>>> Hi Steven,
>>>
>>> A small regression with S3 signing has been reported to me. The fix is
>>> simple:
>>>
>>> https://github.com/apache/iceberg/pull/13718
>>>
>>> Would it still be possible to include it in 1.10, please?
>>>
>>> Thanks,
>>> Alex
>>>
>>>
>>> On Thu, Jul 31, 2025 at 7:19 PM Steven Wu <stevenz...@gmail.com> wrote:
>>> >
>>> > Currently, the 1.10.0 milestone has no open PRs
>>> > https://github.com/apache/iceberg/milestone/54
>>> >
>>> > The variant PRs were merged this week and last. There are still some
>>> variant-testing-related PRs, which are probably not blockers for the
>>> 1.10.0 release.
>>> > * Spark variant read: https://github.com/apache/iceberg/pull/13219
>>> > * use short strings: https://github.com/apache/iceberg/pull/13284
>>> >
>>> > We are still waiting for the following two changes
>>> > * Anton's fix for the data frame join using the same snapshot, which
>>> will introduce a slight behavior change in Spark 4.0.
>>> > * unknown type support. Fokko raised a discussion thread on a blocking
>>> issue.
>>> >
>>> > Is there anything else I missed?
>>> >
>>> >
>>> >
>>> > On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <fo...@apache.org>
>>> wrote:
>>> >>
>>> >> Hey all,
>>> >>
>>> >> The read path for the UnknownType needs some community discussion.
>>> I've raised a separate thread. PTAL
>>> >>
>>> >> Kind regards from Belgium,
>>> >> Fokko
>>> >>
>>> >> On Sat, Jul 26, 2025 at 00:58, Ryan Blue <rdb...@gmail.com> wrote:
>>> >>>
>>> >>> I thought we said we wanted to get support out for v3 features in
>>> this release unless there is some reasonable blocker, like Spark not
>>> having geospatial types. To me, that means we should aim to get variant
>>> and unknown done so that we have a complete implementation with a major
>>> engine. And it should not be particularly difficult to get unknown done,
>>> so I'd opt to get it in.
>>> >>>
>>> >>> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <stevenz...@gmail.com>
>>> wrote:
>>> >>>>
>>> >>>> > I believe we also wanted to get in at least the read path for
>>> UnknownType. Fokko has a WIP PR for that.
>>> >>>> I thought the consensus in the community sync was that this is not
>>> a blocker, because it is a new feature implementation. If it is ready, it
>>> will be included.
>>> >>>>
>>> >>>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <kevinjq...@apache.org>
>>> wrote:
>>> >>>>>
>>> >>>>> I think Fokko's OOO. Should we help with that PR?
>>> >>>>>
>>> >>>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <
>>> etudenhoef...@apache.org> wrote:
>>> >>>>>>
>>> >>>>>> I believe we also wanted to get in at least the read path for
>>> UnknownType. Fokko has a WIP PR for that.
>>> >>>>>>
>>> >>>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <stevenz...@gmail.com>
>>> wrote:
>>> >>>>>>>
>>> >>>>>>> 3. Spark: fix data frame join based on different versions of the
>>> same table that may lead to weird results. Anton is working on a fix. It
>>> requires a small behavior change (table state may be stale up to the
>>> refresh interval). Hence it is better to include it in the 1.10.0 release
>>> where Spark 4.0 is first supported.
>>> >>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is
>>> very close and will prioritize the review.
>>> >>>>>>>
>>> >>>>>>> We still have the above two issues pending. 3 doesn't have a PR
>>> yet. PR for 4 is not associated with the milestone yet.
>>> >>>>>>>
>>> >>>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <kevinjq...@apache.org>
>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>> Thanks everyone for the review. The 2 PRs are both merged.
>>> >>>>>>>> Looks like there's only 1 PR left in the 1.10 milestone :)
>>> >>>>>>>>
>>> >>>>>>>> Best,
>>> >>>>>>>> Kevin Liu
>>> >>>>>>>>
>>> >>>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <
>>> owenzhang1...@gmail.com> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>> Thanks Kevin. The first change is not in the versioned doc so
>>> it can be released anytime.
>>> >>>>>>>>>
>>> >>>>>>>>> Regards,
>>> >>>>>>>>> Manu
>>> >>>>>>>>>
>>> >>>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <
>>> kevinjq...@apache.org> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>> The 3 PRs above are merged. Thanks everyone for the review.
>>> >>>>>>>>>>
>>> >>>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are both
>>> nice-to-haves.
>>> >>>>>>>>>> - docs: add subpage for REST Catalog Spec in "Specification"
>>> #13521
>>> >>>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest
>>> fixture #13599
>>> >>>>>>>>>>
>>> >>>>>>>>>> The first one changes the link for "REST Catalog Spec" on the
>>> left nav of https://iceberg.apache.org/spec/ from the swagger.io link
>>> to a dedicated page for IRC.
>>> >>>>>>>>>> The second one fixes the default behavior of
>>> `iceberg-rest-fixture` image to align with the general expectation when
>>> creating a table in a catalog.
>>> >>>>>>>>>>
>>> >>>>>>>>>> Please take a look. I would like to have both of these as
>>> part of the 1.10 release.
>>> >>>>>>>>>>
>>> >>>>>>>>>> Best,
>>> >>>>>>>>>> Kevin Liu
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <
>>> kevinjq...@apache.org> wrote:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Here are the 3 PRs to add corresponding tests.
>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13648
>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13649
>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13650
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI to
>>> complete :)
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Best,
>>> >>>>>>>>>>> Kevin Liu
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <
>>> stevenz...@gmail.com> wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Kevin, thanks for checking that. I will take a look at your
>>> backport PRs. Can you add them to the 1.10.0 milestone?
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <
>>> kevinjq...@apache.org> wrote:
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Thanks again for driving this Steven! We're very close!!
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> As mentioned in the community sync today, I wanted to
>>> verify feature parity between Spark 3.5 and Spark 4.0 for this release.
>>> >>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have
>>> feature parity for this upcoming release. More details in the other devlist
>>> thread https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Thanks,
>>> >>>>>>>>>>>>> Kevin Liu
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <
>>> stevenz...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Another update on the release.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> The existing blocker PRs are almost done.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> During today's community sync, we identified the
>>> following issues/PRs to be included in the 1.10.0 release.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> 1. Backport of PR 13100 to the main branch. I have created
>>> a cherry-pick PR for that. There is a one-line difference compared to the
>>> original PR due to the removal of the deprecated RemoveSnapshot class in
>>> the main branch for the 1.10.0 target. Amogh has suggested using
>>> RemoveSnapshots with a single snapshot id, which should be supported by
>>> all REST catalog servers.
>>> >>>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail the
>>> compaction for V3 tables. I created a PR for that. Will backport after it
>>> is merged.
>>> >>>>>>>>>>>>>> 3. Spark: fix data frame join based on different versions
>>> of the same table that may lead to weird results. Anton is working on a
>>> fix. It requires a small behavior change (table state may be stale up to
>>> the refresh interval). Hence it is better to include it in the 1.10.0
>>> release where Spark 4.0 is first supported.
>>> >>>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this
>>> is very close and will prioritize the review.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Thanks,
>>> >>>>>>>>>>>>>> steven
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> The 1.10.0 milestone can be found here.
>>> >>>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <
>>> stevenz...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include the
>>> PR in the 1.10.0 milestone.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt
>>> <ro...@confluent.io.invalid> wrote:
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent point
>>> of view, we will not be able to publish the connector on Confluent Hub
>>> until this CVE[1] is fixed.
>>> >>>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the fix
>>> doesn't make it into 1.10 then we'd have to wait for 1.11 (or a dot release
>>> of 1.10) to be able to include the connector on Confluent Hub.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> Thanks, Robin.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> [1]
>>> https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <
>>> ajanthab...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> I have approached Confluent people to help us publish
>>> the OSS Kafka Connect Iceberg sink plugin.
>>> >>>>>>>>>>>>>>>>> It seems we have a CVE from a dependency that blocks
>>> us from publishing the plugin.
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Please include the PR below, which fixes that, in the
>>> 1.10.0 release.
>>> >>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> - Ajantha
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <
>>> stevenz...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> > Engines may model operations as deleting/inserting
>>> rows or as modifications to rows that preserve row ids.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some
>>> context. The first half (as deleting/inserting rows) is probably about the
>>> row lineage handling with equality deletes, which is described in another
>>> place.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows updated
>>> via Equality Deletes, because engines using equality deletes avoid reading
>>> existing data before writing changes and can't provide the original row ID
>>> for the new rows. These updates are always treated as if the existing row
>>> was completely removed and a unique new row was added."
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <
>>> owenzhang1...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part but the following
>>> sentence is a bit hard to understand (maybe just me)
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> Engines may model operations as deleting/inserting
>>> rows or as modifications to rows that preserve row ids.
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> Can you please help to explain?
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 04:41, Steven Wu
>>> <stevenz...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> Manu
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry over
>>> (for replace)
>>> >>>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different data
>>> file for any reason, writers should write _row_id and
>>> _last_updated_sequence_number according to the following rules:"
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> Thanks,
>>> >>>>>>>>>>>>>>>>>>>> Steven
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <
>>> stevenz...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> another update on the release.
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone
>>> (with 25 closed PRs). Amogh is actively working on the last blocker PR.
>>> >>>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on
>>> compaction
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the above
>>> blocker is merged and backported.
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> Thanks,
>>> >>>>>>>>>>>>>>>>>>>>> Steven
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <
>>> owenzhang1...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> Hi Amogh,
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that the "replace"
>>> operation should carry over existing lineage info instead of assigning
>>> new IDs? If not, we'd better first define it in the spec, because all
>>> engines and implementations need to follow it.
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar <
>>> 2am...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure
>>> works with row lineage before release is data file compaction. At the
>>> moment, it looks like compaction will read the records from the data files
>>> without projecting the lineage fields. What this means is that on write of
>>> the new compacted data files we'd be losing the lineage information.
>>> There's no data change in a compaction, but we do need to make sure the
>>> lineage info from carried-over records is materialized in the newly
>>> compacted files so they don't get new IDs or inherit the new file sequence
>>> number. I'm working on addressing this, and I'd call it out as a blocker
>>> as well.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> --
>>> >>>>>>>>>>>>>>>> Robin Moffatt
>>> >>>>>>>>>>>>>>>> Sr. Principal Advisor, Streaming Data Technologies
>>>
>>
