Hi Steven,

A small regression with S3 signing has been reported to me. The fix is simple:
https://github.com/apache/iceberg/pull/13718

Would it still be possible to have it in 1.10, please?

Thanks,
Alex

On Thu, Jul 31, 2025 at 7:19 PM Steven Wu <stevenz...@gmail.com> wrote:
>
> Currently, the 1.10.0 milestone has no open PRs:
> https://github.com/apache/iceberg/milestone/54
>
> The variant PRs were merged this week and last. There are still some variant testing related PRs, which are probably not blockers for the 1.10.0 release.
> * Spark variant read: https://github.com/apache/iceberg/pull/13219
> * use short strings: https://github.com/apache/iceberg/pull/13284
>
> We are still waiting for the following two changes:
> * Anton's fix for the data frame join using the same snapshot, which will introduce a slight behavior change in Spark 4.0.
> * Unknown type support. Fokko raised a discussion thread on a blocking issue.
>
> Did I miss anything else?
>
> On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <fo...@apache.org> wrote:
>>
>> Hey all,
>>
>> The read path for the UnknownType needs some community discussion. I've raised a separate thread. PTAL.
>>
>> Kind regards from Belgium,
>> Fokko
>>
>> On Sat, Jul 26, 2025 at 00:58, Ryan Blue <rdb...@gmail.com> wrote:
>>>
>>> I thought that we said we wanted to get support out for v3 features in this release unless there is some reasonable blocker, like Spark not having geospatial types. To me, that means we should aim to get variant and unknown done so that we have a complete implementation with a major engine. And it should not be particularly difficult to get unknown done, so I'd opt to get it in.
>>>
>>> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>
>>>> > I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR for that.
>>>>
>>>> I thought the consensus in the community sync was that this is not a blocker, because it is a new feature implementation. If it is ready, it will be included.
>>>>
>>>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>
>>>>> I think Fokko's OOO. Should we help with that PR?
>>>>>
>>>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <etudenhoef...@apache.org> wrote:
>>>>>>
>>>>>> I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR for that.
>>>>>>
>>>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>
>>>>>>> 3. Spark: fix data frame join based on different versions of the same table that may lead to weird results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to refresh interval). Hence it is better to include it in the 1.10.0 release where Spark 4.0 is first supported.
>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.
>>>>>>>
>>>>>>> We still have the above two issues pending. 3 doesn't have a PR yet. The PR for 4 is not associated with the milestone yet.
>>>>>>>
>>>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>>
>>>>>>>> Thanks everyone for the review. The 2 PRs are both merged. Looks like there's only 1 PR left in the 1.10 milestone :)
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Kevin Liu
>>>>>>>>
>>>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Thanks Kevin. The first change is not in the versioned doc, so it can be released anytime.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Manu
>>>>>>>>>
>>>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> The 3 PRs above are merged. Thanks everyone for the review.
>>>>>>>>>>
>>>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are both nice-to-haves.
>>>>>>>>>> - docs: add subpage for REST Catalog Spec in "Specification" #13521
>>>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest fixture #13599
>>>>>>>>>>
>>>>>>>>>> The first one changes the link for "REST Catalog Spec" on the left nav of https://iceberg.apache.org/spec/ from the swagger.io link to a dedicated page for IRC.
>>>>>>>>>> The second one fixes the default behavior of the `iceberg-rest-fixture` image to align with the general expectation when creating a table in a catalog.
>>>>>>>>>>
>>>>>>>>>> Please take a look. I would like to have both of these as part of the 1.10 release.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Kevin Liu
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Here are the 3 PRs to add corresponding tests:
>>>>>>>>>>> https://github.com/apache/iceberg/pull/13648
>>>>>>>>>>> https://github.com/apache/iceberg/pull/13649
>>>>>>>>>>> https://github.com/apache/iceberg/pull/13650
>>>>>>>>>>>
>>>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI to complete :)
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Kevin Liu
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Kevin, thanks for checking that. I will take a look at your backport PRs. Can you add them to the 1.10.0 milestone?
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks again for driving this, Steven! We're very close!!
>>>>>>>>>>>>>
>>>>>>>>>>>>> As mentioned in the community sync today, I wanted to verify feature parity between Spark 3.5 and Spark 4.0 for this release.
>>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have feature parity for this upcoming release. More details in the other devlist thread:
>>>>>>>>>>>>> https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Kevin Liu
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another update on the release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The existing blocker PRs are almost done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> During today's community sync, we identified the following issues/PRs to be included in the 1.10.0 release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Backport of PR 13100 to the main branch. I have created a cherry-pick PR for that. There is a one-line difference compared to the original PR due to the removal of the deprecated RemoveSnapshot class in the main branch for the 1.10.0 target. Amogh has suggested using RemoveSnapshots with a single snapshot id, which should be supported by all REST catalog servers.
>>>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail the compaction for V3 tables. I created a PR for that. Will backport after it is merged.
>>>>>>>>>>>>>> 3. Spark: fix data frame join based on different versions of the same table that may lead to weird results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to refresh interval). Hence it is better to include it in the 1.10.0 release where Spark 4.0 is first supported.
>>>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The 1.10.0 milestone can be found here:
>>>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include the PR in the 1.10.0 milestone.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt <ro...@confluent.io.invalid> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent point of view, we will not be able to publish the connector on Confluent Hub until this CVE[1] is fixed.
>>>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the fix doesn't make it into 1.10 then we'd have to wait for 1.11 (or a dot release of 1.10) to be able to include the connector on Confluent Hub.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks, Robin.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <ajanthab...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have approached Confluent people to help us publish the OSS Kafka Connect Iceberg sink plugin.
>>>>>>>>>>>>>>>>> It seems we have a CVE from a dependency that blocks us from publishing the plugin.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please include the below PR in the 1.10.0 release, which fixes that.
>>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Ajantha
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some context. The first half (as deleting/inserting rows) is probably about the row lineage handling with equality deletes, which is described in another place:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows updated via Equality Deletes, because engines using equality deletes avoid reading existing data before writing changes and can't provide the original row ID for the new rows. These updates are always treated as if the existing row was completely removed and a unique new row was added."
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part, but the following sentence is a bit hard to understand (maybe it's just me):
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> "Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids."
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Can you please help explain?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 04:41, Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Manu,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry-over (for replace):
>>>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different data file for any reason, writers should write _row_id and _last_updated_sequence_number according to the following rules:"
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Another update on the release.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone (with 25 closed PRs). Amogh is actively working on the last blocker PR:
>>>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on compaction
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the above blocker is merged and backported.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Hi Amogh,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that the "replace" operation should carry over existing lineage info instead of assigning new IDs?
>>>>>>>>>>>>>>>>>>>>>> If not, we'd better define it in the spec first, because all engines and implementations need to follow it.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar <2am...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure works with row lineage before release is data file compaction. At the moment, it looks like compaction will read the records from the data files without projecting the lineage fields. What this means is that on write of the new compacted data files we'd be losing the lineage information. There's no data change in a compaction, but we do need to make sure the lineage info from carried-over records is materialized in the newly compacted files so they don't get new IDs or inherit the new file sequence number. I'm working on addressing this as well, and I'd call it out as a blocker.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Robin Moffatt
>>>>>>>>>>>>>>>> Sr. Principal Advisor, Streaming Data Technologies
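The carry-over rule quoted from the spec above (writers should materialize _row_id and _last_updated_sequence_number when rows move to a new data file) can be sketched in a few lines. This is a minimal, hypothetical Python model of the semantics only; the dict-based file layout and the compact() helper are invented for illustration and are not Iceberg's actual API or implementation:

```python
# Illustrative model of the row-lineage carry-over rule discussed in this
# thread. A null lineage value means "inherited": _row_id resolves to the
# source file's first_row_id plus the row's position, and the sequence
# number resolves to the source file's data sequence number. A compaction
# that drops these values would (incorrectly) let the rewritten rows
# inherit new ids and the new file's sequence number.

def compact(data_files):
    """Rewrite rows from several data files into one logical output,
    materializing the lineage columns instead of re-assigning them."""
    compacted = []
    for f in data_files:
        for pos, row in enumerate(f["rows"]):
            row_id = row["_row_id"]
            if row_id is None:
                row_id = f["first_row_id"] + pos
            seq = row["_last_updated_sequence_number"]
            if seq is None:
                seq = f["sequence_number"]
            compacted.append({**row,
                              "_row_id": row_id,
                              "_last_updated_sequence_number": seq})
    return compacted

# Two source files whose rows never materialized their lineage values.
files = [
    {"first_row_id": 100, "sequence_number": 5,
     "rows": [{"data": "a", "_row_id": None,
               "_last_updated_sequence_number": None}]},
    {"first_row_id": 200, "sequence_number": 7,
     "rows": [{"data": "b", "_row_id": None,
               "_last_updated_sequence_number": None}]},
]

result = compact(files)
# Lineage survives the rewrite: the rows keep their original row ids and
# sequence numbers rather than picking up values from the new file.
```

This is the behavior Amogh describes as missing when compaction reads records without projecting the lineage fields: with the fields projected and materialized, a rewrite changes file layout but not lineage.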