> I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR <https://github.com/apache/iceberg/pull/13445> for that.

I thought the consensus in the community sync was that this is not a blocker, because it is a new feature implementation. If it is ready, it will be included.

On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <kevinjq...@apache.org> wrote:

I think Fokko's OOO. Should we help with that PR?

On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <etudenhoef...@apache.org> wrote:

I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR <https://github.com/apache/iceberg/pull/13445> for that.

On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <stevenz...@gmail.com> wrote:

3. Spark: fix DataFrame joins based on different versions of the same table, which may lead to incorrect results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to the refresh interval), so it is better to include it in the 1.10.0 release, where Spark 4.0 is first supported.
4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.

We still have the above two issues pending. Item 3 doesn't have a PR yet; the PR for item 4 is not associated with the milestone yet.

On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <kevinjq...@apache.org> wrote:

Thanks everyone for the review. The 2 PRs are both merged. Looks like there's only 1 PR left in the 1.10 milestone <https://github.com/apache/iceberg/milestone/54> :)

Best,
Kevin Liu

On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <owenzhang1...@gmail.com> wrote:

Thanks Kevin. The first change is not in the versioned docs, so it can be released anytime.

Regards,
Manu

On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <kevinjq...@apache.org> wrote:

The 3 PRs above are merged. Thanks everyone for the review.

I've added 2 more PRs to the 1.10 milestone. These are both nice-to-haves.

- docs: add subpage for REST Catalog Spec in "Specification" #13521 <https://github.com/apache/iceberg/pull/13521>
- REST-Fixture: Ensure strict mode on jdbc catalog for rest fixture #13599 <https://github.com/apache/iceberg/pull/13599>

The first one changes the "REST Catalog Spec" link in the left nav of https://iceberg.apache.org/spec/ from the swagger.io link to a dedicated page for the Iceberg REST Catalog. The second one fixes the default behavior of the `iceberg-rest-fixture` image to align with the general expectation when creating a table in a catalog.

Please take a look. I would like to have both of these as part of the 1.10 release.

Best,
Kevin Liu

On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <kevinjq...@apache.org> wrote:

Here are the 3 PRs to add corresponding tests:
https://github.com/apache/iceberg/pull/13648
https://github.com/apache/iceberg/pull/13649
https://github.com/apache/iceberg/pull/13650

I've tagged them with the 1.10 milestone and am waiting for CI to complete :)

Best,
Kevin Liu

On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <stevenz...@gmail.com> wrote:

Kevin, thanks for checking that. I will take a look at your backport PRs. Can you add them to the 1.10.0 milestone?

On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <kevinjq...@apache.org> wrote:

Thanks again for driving this, Steven! We're very close!!

As mentioned in the community sync today, I wanted to verify feature parity between Spark 3.5 and Spark 4.0 for this release. I was able to confirm that Spark 3.5 and Spark 4.0 have feature parity for this upcoming release. More details are in the other dev-list thread:
https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f

Thanks,
Kevin Liu

On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <stevenz...@gmail.com> wrote:

Another update on the release.

The existing blocker PRs are almost done.

During today's community sync, we identified the following issues/PRs to be included in the 1.10.0 release:

1. Backport of PR 13100 to the main branch. I have created a cherry-pick PR <https://github.com/apache/iceberg/pull/13647> for that. There is a one-line difference compared to the original PR, due to the removal of the deprecated RemoveSnapshot class on the main branch targeting 1.10.0. Amogh has suggested using RemoveSnapshots with a single snapshot id, which should be supported by all REST catalog servers.
2. Flink compaction doesn't support row lineage; fail the compaction for V3 tables. I created a PR <https://github.com/apache/iceberg/pull/13646> for that and will backport it after it is merged.
3. Spark: fix DataFrame joins based on different versions of the same table, which may lead to incorrect results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to the refresh interval), so it is better to include it in the 1.10.0 release, where Spark 4.0 is first supported.
4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.

Thanks,
Steven

The 1.10.0 milestone can be found here:
https://github.com/apache/iceberg/milestone/54

On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <stevenz...@gmail.com> wrote:

Ajantha/Robin, thanks for the note. We can include the PR in the 1.10.0 milestone.

On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt <ro...@confluent.io.invalid> wrote:

Thanks Ajantha. Just to confirm, from a Confluent point of view, we will not be able to publish the connector on Confluent Hub until this CVE [1] is fixed. Since we would not publish a snapshot build, if the fix doesn't make it into 1.10, then we'd have to wait for 1.11 (or a dot release of 1.10) to be able to include the connector on Confluent Hub.

Thanks, Robin.

[1] https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861

--
Robin Moffatt
Sr. Principal Advisor, Streaming Data Technologies

On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <ajanthab...@gmail.com> wrote:

I have approached Confluent people <https://github.com/apache/iceberg/issues/10745#issuecomment-3058281281> to help us publish the OSS Kafka Connect Iceberg sink plugin. It seems we have a CVE from a dependency that blocks us from publishing the plugin.

Please include the below PR in the 1.10.0 release, which fixes that:
https://github.com/apache/iceberg/pull/13561

- Ajantha

On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <stevenz...@gmail.com> wrote:

> Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.

Manu, I agree this sentence probably lacks some context. The first half (deleting/inserting rows) is probably about the row lineage handling with equality deletes, which is described in another place:

"Row lineage does not track lineage for rows updated via Equality Deletes <https://iceberg.apache.org/spec/#equality-delete-files>, because engines using equality deletes avoid reading existing data before writing changes and can't provide the original row ID for the new rows. These updates are always treated as if the existing row was completely removed and a unique new row was added."

On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <owenzhang1...@gmail.com> wrote:

Thanks Steven, I missed that part, but the following sentence is a bit hard to understand (maybe it's just me):

> Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.

Can you please help explain?

On Tue, Jul 15, 2025 at 4:41 AM Steven Wu <stevenz...@gmail.com> wrote:

Manu,

The spec already covers the row lineage carry-over (for replace):
https://iceberg.apache.org/spec/#row-lineage

"When an existing row is moved to a different data file for any reason, writers should write _row_id and _last_updated_sequence_number according to the following rules:"

Thanks,
Steven

On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <stevenz...@gmail.com> wrote:

Another update on the release.

We have one open PR left in the 1.10.0 milestone <https://github.com/apache/iceberg/milestone/54> (with 25 closed PRs). Amogh is actively working on the last blocker PR: Spark 4.0: Preserve row lineage information on compaction <https://github.com/apache/iceberg/pull/13555>.

I will publish a release candidate after the above blocker is merged and backported.

Thanks,
Steven

On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <owenzhang1...@gmail.com> wrote:

Hi Amogh,

Is it defined in the table spec that the "replace" operation should carry over existing lineage info instead of assigning new IDs? If not, we'd better first define it in the spec, because all engines and implementations need to follow it.

On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar <2am...@gmail.com> wrote:

One other area I think we need to make sure works with row lineage before the release is data file compaction. At the moment <https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/SparkBinPackFileRewriteRunner.java#L44>, it looks like compaction will read the records from the data files without projecting the lineage fields. What this means is that on write of the new compacted data files, we'd be losing the lineage information. There's no data change in a compaction, but we do need to make sure the lineage info from carried-over records is materialized in the newly compacted files, so they don't get new IDs or inherit the new file sequence number. I'm working on addressing this, but I'd call it out as a blocker as well.
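[Editor's note] The row-lineage concern discussed in this thread (compaction rewriting rows without projecting `_row_id` and `_last_updated_sequence_number`) can be made concrete with a small sketch. The following is a hypothetical Python model of the v3 spec's inheritance rule, not Iceberg's actual Java implementation; the function name and the dict-based row representation are invented for illustration. Per the spec, a null `_row_id` is resolved as the data file's `first_row_id` plus the row's position, and a null `_last_updated_sequence_number` inherits the data file's data sequence number; a rewrite such as compaction must materialize these resolved values rather than write nulls, or the rows would inherit fresh IDs and the new file's sequence number.

```python
def materialize_lineage(rows, source_first_row_id, source_data_seq):
    """Resolve inherited lineage fields before rewriting rows to a new file.

    rows: list of dicts with '_row_id' and '_last_updated_sequence_number'
          keys, where None means "inherited from the source data file".
    source_first_row_id: first_row_id assigned to the source data file.
    source_data_seq: data sequence number of the source data file.
    """
    out = []
    for pos, row in enumerate(rows):
        resolved = dict(row)
        # Null _row_id is inherited as first_row_id + position in the file.
        if resolved.get("_row_id") is None:
            resolved["_row_id"] = source_first_row_id + pos
        # Null _last_updated_sequence_number is inherited from the
        # source data file's sequence number.
        if resolved.get("_last_updated_sequence_number") is None:
            resolved["_last_updated_sequence_number"] = source_data_seq
        out.append(resolved)
    return out

# Compacting a file whose rows never materialized their lineage fields:
rows = [
    {"id": 1, "_row_id": None, "_last_updated_sequence_number": None},
    {"id": 2, "_row_id": 7, "_last_updated_sequence_number": 3},
]
compacted = materialize_lineage(rows, source_first_row_id=100, source_data_seq=5)
# Row at position 0 resolves to _row_id=100 and sequence number 5;
# the row with explicit lineage values is carried over unchanged.
```

If a rewrite instead dropped these columns and wrote nulls, readers would re-derive lineage from the *new* file's `first_row_id` and sequence number, which is exactly the loss of lineage described above.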