Sorry, the PR link for the staging-binaries.sh fix in my previous email was
wrong (missing a digit). The correct link is below.

I thought this PR would fix the issue. Initially, it worked well for a few
runs, but later the same problem came back. Suggestions are appreciated!
https://github.com/apache/iceberg/pull/13958
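
One idea I may try next (purely a sketch, and it assumes the script ends up
running more than one publish invocation at the same time, which I have not
confirmed): push every publish through a file lock so they cannot overlap.
The task name and flags below are placeholders, not the real script:

    #!/usr/bin/env bash
    # Hypothetical sketch: serialize publish invocations with a file lock so
    # that only one of them talks to the Nexus staging repository at a time.
    set -euo pipefail

    lock_file="/tmp/stage-binaries.lock"

    publish_serialized() {
      # flock waits until the lock is free, then runs the command exclusively.
      flock "$lock_file" ./gradlew publish "$@"
    }

    # Placeholder flavor flags; the real script would pass its own arguments.
    publish_serialized -DflavorA=true
    publish_serialized -DflavorB=true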

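On the second issue from my note below (the iceberg-api module missing from
the RC2 staging repo), a sanity check I might run before starting a vote
(just a sketch, with a hand-picked module list and the staging URL from that
run):

    #!/usr/bin/env bash
    # Hypothetical sketch: confirm that the expected modules made it into the
    # Nexus staging repository by probing the standard Maven layout.
    set -euo pipefail

    base="https://repository.apache.org/content/repositories/orgapacheiceberg-1243"
    version="1.10.0"

    # Illustrative subset; a real check would cover every published module.
    for module in iceberg-api iceberg-core iceberg-common; do
      url="$base/org/apache/iceberg/$module/$version/$module-$version.pom"
      if curl -sfI "$url" > /dev/null; then
        echo "found:   $module"
      else
        echo "MISSING: $module"
      fi
    done
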
On Tue, Sep 2, 2025 at 9:51 AM Steven Wu <stevenz...@gmail.com> wrote:

> Hi,
>
> Just to update the community on the status.
>
> Fokko also reached out to include Parquet Java 1.16.0 in this release.
> The vote just passed in the Parquet community. We are waiting for the binary
> release. We will try to include it in the 1.10.0 release. Reviews are
> welcome.
> https://github.com/apache/iceberg/pull/1394
>
> We also ran into a couple of issues with the release script/process.
>
> 1) staging-binaries.sh has a race condition on concurrent publish, which
> results in 2 folders in the Maven repo.
>
> I thought this PR would fix the issue. Initially, it worked well for a few
> runs, but later the same problem came back. Suggestions are appreciated!
> https://github.com/apache/iceberg/pull/13958
>
> 2) Yuya found out that the iceberg-api module wasn't published in the RC2
> staging (1243).
> https://repository.apache.org/content/repositories/orgapacheiceberg-1243/
>
> The first release issue is the more annoying/impactful problem. The second
> release issue is uncommon, as I didn't see it in a few other runs of
> staging-binaries.sh.
>
> Thanks,
> Steven
>
>
>
> On Sun, Aug 31, 2025 at 12:48 PM Steven Wu <stevenz...@gmail.com> wrote:
>
>> I started a vote thread for 1.10.0 RC2.
>>
>> I had to fix a couple of release script issues; hence the first release
>> candidate up for a vote is RC2.
>>
>> On Fri, Aug 29, 2025 at 9:53 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>
>>> Thanks Steven! I did another pass to check for feature parity between
>>> Spark 3.5 and Spark 4.0 for this release, and everything looks good. There
>>> are a few test cases that have not been ported, but we can punt those for
>>> now.
>>>
>>> Best,
>>> Kevin Liu
>>>
>>> On Thu, Aug 28, 2025 at 7:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>
>>>> Thanks to Fokko and Ryan, the unknown type support PR was merged today.
>>>>
>>>> Everything in the 1.10.0 milestone is closed now.
>>>>
>>>> I will work on a release candidate next.
>>>>
>>>> On Fri, Aug 8, 2025 at 6:14 AM Fokko Driesprong <fo...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Steven,
>>>>>
>>>>> Thanks for updating this thread.
>>>>>
>>>>> I've updated the UnknownType PR
>>>>> <https://github.com/apache/iceberg/pull/13445> to first block on the
>>>>> complex cases that will require some more discussion. This way we can
>>>>> also revisit this after the 1.10.0 release.
>>>>>
>>>>> Kind regards,
>>>>> Fokko
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Aug 7, 2025 at 11:56 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>
>>>>>> Edited the subject line since we are now into August.
>>>>>>
>>>>>> We are still waiting for the following two changes for the 1.10.0
>>>>>> release:
>>>>>> * Anton's fix for the data frame join using the same snapshot, which
>>>>>> will introduce a slight behavior change in Spark 4.0.
>>>>>> * unknown type support.
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 1, 2025 at 6:56 AM Alexandre Dutra <adu...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Steven,
>>>>>>>
>>>>>>> A small regression with S3 signing has been reported to me. The fix
>>>>>>> is simple:
>>>>>>>
>>>>>>> https://github.com/apache/iceberg/pull/13718
>>>>>>>
>>>>>>> Would it still be possible to have it in 1.10, please?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Alex
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 31, 2025 at 7:19 PM Steven Wu <stevenz...@gmail.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Currently, the 1.10.0 milestone has no open PRs:
>>>>>>> > https://github.com/apache/iceberg/milestone/54
>>>>>>> >
>>>>>>> > The variant PRs were merged this week and last. There are still some
>>>>>>> variant testing related PRs, which are probably not blockers for the
>>>>>>> 1.10.0 release.
>>>>>>> > * Spark variant read: https://github.com/apache/iceberg/pull/13219
>>>>>>> > * use short strings: https://github.com/apache/iceberg/pull/13284
>>>>>>> >
>>>>>>> > We are still waiting for the following two changes:
>>>>>>> > * Anton's fix for the data frame join using the same snapshot,
>>>>>>> which will introduce a slight behavior change in Spark 4.0.
>>>>>>> > * unknown type support. Fokko raised a discussion thread on a
>>>>>>> blocking issue.
>>>>>>> >
>>>>>>> > Did I miss anything else?
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <fo...@apache.org>
>>>>>>> wrote:
>>>>>>> >>
>>>>>>> >> Hey all,
>>>>>>> >>
>>>>>>> >> The read path for the UnknownType needs some community
>>>>>>> discussion. I've raised a separate thread. PTAL
>>>>>>> >>
>>>>>>> >> Kind regards from Belgium,
>>>>>>> >> Fokko
>>>>>>> >>
>>>>>>> >> On Sat, Jul 26, 2025 at 12:58 AM Ryan Blue <rdb...@gmail.com> wrote:
>>>>>>> >>>
>>>>>>> >>> I thought that we said we wanted to get support out for v3
>>>>>>> features in this release unless there is some reasonable blocker, like
>>>>>>> Spark not having geospatial types. To me, that means we should aim to
>>>>>>> get variant and unknown done so that we have a complete implementation
>>>>>>> with a major engine. And it should not be particularly difficult to get
>>>>>>> unknown done, so I'd opt to get it in.
>>>>>>> >>>
>>>>>>> >>> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <stevenz...@gmail.com>
>>>>>>> wrote:
>>>>>>> >>>>
>>>>>>> >>>> > I believe we also wanted to get in at least the read path for
>>>>>>> UnknownType. Fokko has a WIP PR for that.
>>>>>>> >>>> I thought the consensus in the community sync was that this is
>>>>>>> not a blocker, because it is a new feature implementation. If it is
>>>>>>> ready, it will be included.
>>>>>>> >>>>
>>>>>>> >>>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <
>>>>>>> kevinjq...@apache.org> wrote:
>>>>>>> >>>>>
>>>>>>> >>>>> I think Fokko's OOO. Should we help with that PR?
>>>>>>> >>>>>
>>>>>>> >>>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <
>>>>>>> etudenhoef...@apache.org> wrote:
>>>>>>> >>>>>>
>>>>>>> >>>>>> I believe we also wanted to get in at least the read path for
>>>>>>> UnknownType. Fokko has a WIP PR for that.
>>>>>>> >>>>>>
>>>>>>> >>>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <
>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> 3. Spark: fix data frame join based on different versions of
>>>>>>> the same table that may lead to weird results. Anton is working on a
>>>>>>> fix. It requires a small behavior change (table state may be stale up
>>>>>>> to the refresh interval). Hence it is better to include it in the
>>>>>>> 1.10.0 release where Spark 4.0 is first supported.
>>>>>>> >>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this
>>>>>>> is very close and will prioritize the review.
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> We still have the above two issues pending. Issue 3 doesn't
>>>>>>> have a PR yet, and the PR for issue 4 is not associated with the
>>>>>>> milestone yet.
>>>>>>> >>>>>>>
>>>>>>> >>>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <
>>>>>>> kevinjq...@apache.org> wrote:
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>> Thanks everyone for the review. The 2 PRs are both merged.
>>>>>>> >>>>>>>> Looks like there's only 1 PR left in the 1.10 milestone :)
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>> Best,
>>>>>>> >>>>>>>> Kevin Liu
>>>>>>> >>>>>>>>
>>>>>>> >>>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <
>>>>>>> owenzhang1...@gmail.com> wrote:
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>> Thanks Kevin. The first change is not in the versioned doc
>>>>>>> so it can be released anytime.
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>> Regards,
>>>>>>> >>>>>>>>> Manu
>>>>>>> >>>>>>>>>
>>>>>>> >>>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <
>>>>>>> kevinjq...@apache.org> wrote:
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> The 3 PRs above are merged. Thanks everyone for the
>>>>>>> review.
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are
>>>>>>> both nice-to-haves.
>>>>>>> >>>>>>>>>> - docs: add subpage for REST Catalog Spec in
>>>>>>> "Specification" #13521
>>>>>>> >>>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for
>>>>>>> rest fixture #13599
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> The first one changes the link for "REST Catalog Spec" on
>>>>>>> the left nav of https://iceberg.apache.org/spec/ from the swagger.io
>>>>>>> link to a dedicated page for IRC.
>>>>>>> >>>>>>>>>> The second one fixes the default behavior of the
>>>>>>> `iceberg-rest-fixture` image to align with the general expectation when
>>>>>>> creating a table in a catalog.
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> Please take a look. I would like to have both of these as
>>>>>>> part of the 1.10 release.
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> Best,
>>>>>>> >>>>>>>>>> Kevin Liu
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>>
>>>>>>> >>>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <
>>>>>>> kevinjq...@apache.org> wrote:
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> Here are the 3 PRs to add corresponding tests.
>>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13648
>>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13649
>>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13650
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI
>>>>>>> to complete :)
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> Best,
>>>>>>> >>>>>>>>>>> Kevin Liu
>>>>>>> >>>>>>>>>>>
>>>>>>> >>>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <
>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> Kevin, thanks for checking that. I will take a look at
>>>>>>> your backport PRs. Can you add them to the 1.10.0 milestone?
>>>>>>> >>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <
>>>>>>> kevinjq...@apache.org> wrote:
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Thanks again for driving this Steven! We're very
>>>>>>> close!!
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> As mentioned in the community sync today, I wanted to
>>>>>>> verify feature parity between Spark 3.5 and Spark 4.0 for this release.
>>>>>>> >>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have
>>>>>>> feature parity for this upcoming release. More details in the other
>>>>>>> devlist thread:
>>>>>>> https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> Thanks,
>>>>>>> >>>>>>>>>>>>> Kevin Liu
>>>>>>> >>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <
>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>> Another update on the release.
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>> The existing blocker PRs are almost done.
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>> During today's community sync, we identified the
>>>>>>> following issues/PRs to be included in the 1.10.0 release.
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>> 1. Backport of PR 13100 to the main branch. I have
>>>>>>> created a cherry-pick PR for that. There is a one line difference
>>>>>>> compared to the original PR due to the removal of the deprecated
>>>>>>> RemoveSnapshot class in main branch for the 1.10.0 target. Amogh has
>>>>>>> suggested using RemoveSnapshots with a single snapshot id, which
>>>>>>> should be supported by all REST catalog servers.
>>>>>>> >>>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail
>>>>>>> the compaction for V3 tables. I created a PR for that. Will backport
>>>>>>> after it is merged.
>>>>>>> >>>>>>>>>>>>>> 3. Spark: fix data frame join based on different
>>>>>>> versions of the same table that may lead to weird results. Anton is
>>>>>>> working on a fix. It requires a small behavior change (table state may
>>>>>>> be stale up to the refresh interval). Hence it is better to include it
>>>>>>> in the 1.10.0 release where Spark 4.0 is first supported.
>>>>>>> >>>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks
>>>>>>> this is very close and will prioritize the review.
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>> Thanks,
>>>>>>> >>>>>>>>>>>>>> steven
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>> The 1.10.0 milestone can be found here.
>>>>>>> >>>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <
>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include
>>>>>>> the PR in the 1.10.0 milestone.
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt
>>>>>>> <ro...@confluent.io.invalid> wrote:
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>> Thanks Ajantha. Just to confirm, from a Confluent
>>>>>>> point of view, we will not be able to publish the connector on Confluent
>>>>>>> Hub until this CVE[1] is fixed.
>>>>>>> >>>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the
>>>>>>> fix doesn't make it into 1.10 then we'd have to wait for 1.11 (or a dot
>>>>>>> release of 1.10) to be able to include the connector on Confluent Hub.
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>> Thanks, Robin.
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>> [1]
>>>>>>> https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <
>>>>>>> ajanthab...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>> I have approached Confluent people to help us
>>>>>>> publish the OSS Kafka Connect Iceberg sink plugin.
>>>>>>> >>>>>>>>>>>>>>>>> It seems we have a CVE from a dependency that blocks
>>>>>>> us from publishing the plugin.
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>> Please include the below PR for the 1.10.0 release,
>>>>>>> which fixes that.
>>>>>>> >>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>> - Ajantha
>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <
>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>> > Engines may model operations as
>>>>>>> deleting/inserting rows or as modifications to rows that preserve row 
>>>>>>> ids.
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some
>>>>>>> context. The first half (as deleting/inserting rows) is probably about 
>>>>>>> the
>>>>>>> row lineage handling with equality deletes, which is described in 
>>>>>>> another
>>>>>>> place.
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows
>>>>>>> updated via Equality Deletes, because engines using equality deletes 
>>>>>>> avoid
>>>>>>> reading existing data before writing changes and can't provide the 
>>>>>>> original
>>>>>>> row ID for the new rows. These updates are always treated as if the
>>>>>>> existing row was completely removed and a unique new row was added."
>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <
>>>>>>> owenzhang1...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part but the
>>>>>>> following sentence is a bit hard to understand (maybe just me)
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>> Engines may model operations as
>>>>>>> deleting/inserting rows or as modifications to rows that preserve row 
>>>>>>> ids.
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>> Can you please help to explain?
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 4:41 AM Steven Wu <
>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>> Manu
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry
>>>>>>> over (for replace)
>>>>>>> >>>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different
>>>>>>> data file for any reason, writers should write _row_id and
>>>>>>> _last_updated_sequence_number according to the following rules:"
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>> >>>>>>>>>>>>>>>>>>>> Steven
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <
>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>> another update on the release.
>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0
>>>>>>> milestone (with 25 closed PRs). Amogh is actively working on the last
>>>>>>> blocker PR.
>>>>>>> >>>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on
>>>>>>> compaction
>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the
>>>>>>> above blocker is merged and backported.
>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>> >>>>>>>>>>>>>>>>>>>>> Steven
>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <
>>>>>>> owenzhang1...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi Amogh,
>>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that the
>>>>>>> "replace" operation should carry over existing lineage info instead of
>>>>>>> assigning new IDs? If not, we'd better first define it in the spec,
>>>>>>> because all engines and implementations need to follow it.
>>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh
>>>>>>> Jahagirdar <2am...@gmail.com> wrote:
>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure
>>>>>>> works with row lineage before release is data file compaction. At the
>>>>>>> moment, it looks like compaction will read the records from the data
>>>>>>> files without projecting the lineage fields. What this means is that on
>>>>>>> write of the new compacted data files we'd be losing the lineage
>>>>>>> information. There's no data change in a compaction but we do need to
>>>>>>> make sure the lineage info from carried over records is materialized in
>>>>>>> the newly compacted files so they don't get new IDs or inherit the new
>>>>>>> file sequence number. I'm working on addressing this as well, but I'd
>>>>>>> call this out as a blocker as well.
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>> >>>>>>>>>>>>>>>> --
>>>>>>> >>>>>>>>>>>>>>>> Robin Moffatt
>>>>>>> >>>>>>>>>>>>>>>> Sr. Principal Advisor, Streaming Data Technologies
>>>>>>>
>>>>>>
