Actually, it was the Parquet 1.16.0 PR that had the wrong link. The correct link is https://github.com/apache/iceberg/pull/13941
On Tue, Sep 2, 2025 at 10:02 AM Steven Wu <stevenz...@gmail.com> wrote:
> sorry, the PR link for the staging-binaries.sh was wrong (missing a digit).
>
> I thought this PR would fix the issue. Initially, it worked well with a few runs. But later I am still experiencing the same problem. Suggestions are appreciated!
> https://github.com/apache/iceberg/pull/13958
>
> On Tue, Sep 2, 2025 at 9:51 AM Steven Wu <stevenz...@gmail.com> wrote:
>
>> Hi,
>>
>> Just to update the community on the status.
>>
>> Fokko also reached out to include Parquet Java 1.16.0 in this release. Vote just passed in the Parquet community. We are waiting for the binary release. We will try to include it in the 1.10.0 release. Reviews are welcome.
>> https://github.com/apache/iceberg/pull/1394
>>
>> We also ran into a couple of issues with the release script/process.
>>
>> 1) staging-binaries.sh has race conditions on concurrent publish and 2 folders in Maven repo.
>>
>> I thought this PR would fix the issue. Initially, it worked well with a few runs. But later I am still experiencing the same problem. Suggestions are appreciated!
>> https://github.com/apache/iceberg/pull/13958
>>
>> 2) Yuya found out that the iceberg-api module wasn't published in the RC2 staging (1243).
>> https://repository.apache.org/content/repositories/orgapacheiceberg-1243/
>>
>> The first release issue is the more annoying/impacting problem. The second release issue is uncommon, as I didn't see it in a few other runs of staging-binaries.sh.
>>
>> Thanks,
>> Steven
>>
>> On Sun, Aug 31, 2025 at 12:48 PM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> I started a vote thread for 1.10.0 RC2.
>>>
>>> I have to fix a couple of release script issues. Hence the first release candidate is RC2 to vote.
>>>
>>> On Fri, Aug 29, 2025 at 9:53 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>
>>>> Thanks Steven!
>>>> I did another pass to check for feature parity between spark 3.5 and spark 4.0 for this release and everything looks good. There are a few test cases that have not been ported, but we can punt those for now.
>>>>
>>>> Best,
>>>> Kevin Liu
>>>>
>>>> On Thu, Aug 28, 2025 at 7:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>
>>>>> Thanks to Fokko and Ryan, the unknown type support PR was merged today.
>>>>>
>>>>> Everything in the 1.10.0 milestone is closed now.
>>>>>
>>>>> I will work on a release candidate next.
>>>>>
>>>>> On Fri, Aug 8, 2025 at 6:14 AM Fokko Driesprong <fo...@apache.org> wrote:
>>>>>
>>>>>> Hi Steven,
>>>>>>
>>>>>> Thanks for updating this thread.
>>>>>>
>>>>>> I've updated the UnknownType PR <https://github.com/apache/iceberg/pull/13445> to first block on the complex cases that will require some more discussion. This way we can revisit this also after the 1.10.0 release.
>>>>>>
>>>>>> Kind regards,
>>>>>> Fokko
>>>>>>
>>>>>> On Thu, Aug 7, 2025 at 11:56 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>
>>>>>>> edited the subject line as we are into August.
>>>>>>>
>>>>>>> We are still waiting for the following two changes for the 1.10.0 release
>>>>>>> * Anton's fix for the data frame join using the same snapshot, which will introduce a slight behavior change in spark 4.0.
>>>>>>> * unknown type support.
>>>>>>>
>>>>>>> On Fri, Aug 1, 2025 at 6:56 AM Alexandre Dutra <adu...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Steven,
>>>>>>>>
>>>>>>>> A small regression with S3 signing has been reported to me. The fix is simple:
>>>>>>>>
>>>>>>>> https://github.com/apache/iceberg/pull/13718
>>>>>>>>
>>>>>>>> Would it be still possible to have it in 1.10 please?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Alex
>>>>>>>>
>>>>>>>> On Thu, Jul 31, 2025 at 7:19 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >
>>>>>>>> > Currently, the 1.10.0 milestone has no open PRs
>>>>>>>> > https://github.com/apache/iceberg/milestone/54
>>>>>>>> >
>>>>>>>> > The variant PR was merged this and last week. There are still some variant testing related PRs, which are probably not blockers for 1.10.0 release.
>>>>>>>> > * Spark variant read: https://github.com/apache/iceberg/pull/13219
>>>>>>>> > * use short strings: https://github.com/apache/iceberg/pull/13284
>>>>>>>> >
>>>>>>>> > We are still waiting for the following two changes
>>>>>>>> > * Anton's fix for the data frame join using the same snapshot, which will introduce a slight behavior change in spark 4.0.
>>>>>>>> > * unknown type support. Fokko raised a discussion thread on a blocking issue.
>>>>>>>> >
>>>>>>>> > Did I miss anything else?
>>>>>>>> >
>>>>>>>> > On Sat, Jul 26, 2025 at 5:52 AM Fokko Driesprong <fo...@apache.org> wrote:
>>>>>>>> >>
>>>>>>>> >> Hey all,
>>>>>>>> >>
>>>>>>>> >> The read path for the UnknownType needs some community discussion. I've raised a separate thread. PTAL
>>>>>>>> >>
>>>>>>>> >> Kind regards from Belgium,
>>>>>>>> >> Fokko
>>>>>>>> >>
>>>>>>>> >> On Sat, Jul 26, 2025 at 12:58 AM Ryan Blue <rdb...@gmail.com> wrote:
>>>>>>>> >>>
>>>>>>>> >>> I thought that we said we wanted to get support out for v3 features in this release unless there is some reasonable blocker, like Spark not having geospatial types. To me, I think that means we should aim to get variant and unknown done so that we have a complete implementation with a major engine. And it should not be particularly difficult to get unknown done so I'd opt to get it in.
>>>>>>>> >>>
>>>>>>>> >>> On Fri, Jul 25, 2025 at 11:28 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>
>>>>>>>> >>>> > I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR for that.
>>>>>>>> >>>> I thought in the community sync the consensus is that this is not a blocker, because it is a new feature implementation. If it is ready, it will be included.
>>>>>>>> >>>>
>>>>>>>> >>>> On Fri, Jul 25, 2025 at 9:43 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>> >>>>>
>>>>>>>> >>>>> I think Fokko's OOO. Should we help with that PR?
>>>>>>>> >>>>>
>>>>>>>> >>>>> On Fri, Jul 25, 2025 at 9:38 AM Eduard Tudenhöfner <etudenhoef...@apache.org> wrote:
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> I believe we also wanted to get in at least the read path for UnknownType. Fokko has a WIP PR for that.
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> On Fri, Jul 25, 2025 at 6:13 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>> 3. Spark: fix data frame join based on different versions of the same table that may lead to weird results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to refresh interval). Hence it is better to include it in the 1.10.0 release where Spark 4.0 is first supported.
>>>>>>>> >>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>> We still have the above two issues pending. 3 doesn't have a PR yet. PR for 4 is not associated with the milestone yet.
>>>>>>>> >>>>>>>
>>>>>>>> >>>>>>> On Fri, Jul 25, 2025 at 9:02 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> Thanks everyone for the review. The 2 PRs are both merged.
>>>>>>>> >>>>>>>> Looks like there's only 1 PR left in the 1.10 milestone :)
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> Best,
>>>>>>>> >>>>>>>> Kevin Liu
>>>>>>>> >>>>>>>>
>>>>>>>> >>>>>>>> On Thu, Jul 24, 2025 at 7:44 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>
>>>>>>>> >>>>>>>>> Thanks Kevin. The first change is not in the versioned doc so it can be released anytime.
>>>>>>>> >>>>>>>>>
>>>>>>>> >>>>>>>>> Regards,
>>>>>>>> >>>>>>>>> Manu
>>>>>>>> >>>>>>>>>
>>>>>>>> >>>>>>>>> On Fri, Jul 25, 2025 at 4:21 AM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>> >>>>>>>>>>
>>>>>>>> >>>>>>>>>> The 3 PRs above are merged. Thanks everyone for the review.
>>>>>>>> >>>>>>>>>>
>>>>>>>> >>>>>>>>>> I've added 2 more PRs to the 1.10 milestone. These are both nice-to-haves.
>>>>>>>> >>>>>>>>>> - docs: add subpage for REST Catalog Spec in "Specification" #13521
>>>>>>>> >>>>>>>>>> - REST-Fixture: Ensure strict mode on jdbc catalog for rest fixture #13599
>>>>>>>> >>>>>>>>>>
>>>>>>>> >>>>>>>>>> The first one changes the link for "REST Catalog Spec" on the left nav of https://iceberg.apache.org/spec/ from the swagger.io link to a dedicated page for IRC.
>>>>>>>> >>>>>>>>>> The second one fixes the default behavior of `iceberg-rest-fixture` image to align with the general expectation when creating a table in a catalog.
>>>>>>>> >>>>>>>>>>
>>>>>>>> >>>>>>>>>> Please take a look. I would like to have both of these as part of the 1.10 release.
>>>>>>>> >>>>>>>>>>
>>>>>>>> >>>>>>>>>> Best,
>>>>>>>> >>>>>>>>>> Kevin Liu
>>>>>>>> >>>>>>>>>>
>>>>>>>> >>>>>>>>>> On Wed, Jul 23, 2025 at 1:31 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>> >>>>>>>>>>>
>>>>>>>> >>>>>>>>>>> Here are the 3 PRs to add corresponding tests.
>>>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13648
>>>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13649
>>>>>>>> >>>>>>>>>>> https://github.com/apache/iceberg/pull/13650
>>>>>>>> >>>>>>>>>>>
>>>>>>>> >>>>>>>>>>> I've tagged them with the 1.10 milestone, waiting for CI to complete :)
>>>>>>>> >>>>>>>>>>>
>>>>>>>> >>>>>>>>>>> Best,
>>>>>>>> >>>>>>>>>>> Kevin Liu
>>>>>>>> >>>>>>>>>>>
>>>>>>>> >>>>>>>>>>> On Wed, Jul 23, 2025 at 1:08 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>> Kevin, thanks for checking that. I will take a look at your backport PRs. Can you add them to the 1.10.0 milestone?
>>>>>>>> >>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:27 PM Kevin Liu <kevinjq...@apache.org> wrote:
>>>>>>>> >>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>> Thanks again for driving this Steven! We're very close!!
>>>>>>>> >>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>> As mentioned in the community sync today, I wanted to verify feature parity between Spark 3.5 and Spark 4.0 for this release.
>>>>>>>> >>>>>>>>>>>>> I was able to verify that Spark 3.5 and Spark 4.0 have feature parity for this upcoming release. More details in the other devlist thread https://lists.apache.org/thread/7x7xcm3y87y81c4grq4nn9gdjd4jm05f
>>>>>>>> >>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>> Thanks,
>>>>>>>> >>>>>>>>>>>>> Kevin Liu
>>>>>>>> >>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>> On Wed, Jul 23, 2025 at 12:17 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>> Another update on the release.
>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>> The existing blocker PRs are almost done.
>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>> During today's community sync, we identified the following issues/PRs to be included in the 1.10.0 release.
>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>> 1. backport of PR 13100 to the main branch. I have created a cherry-pick PR for that. There is a one line difference compared to the original PR due to the removal of the deprecated RemoveSnapshot class in main branch for 1.10.0 target. Amogh has suggested using RemoveSnapshots with a single snapshot id, which should be supported by all REST catalog servers.
>>>>>>>> >>>>>>>>>>>>>> 2. Flink compaction doesn't support row lineage. Fail the compaction for V3 tables. I created a PR for that. Will backport after it is merged.
>>>>>>>> >>>>>>>>>>>>>> 3. Spark: fix data frame join based on different versions of the same table that may lead to weird results. Anton is working on a fix. It requires a small behavior change (table state may be stale up to refresh interval). Hence it is better to include it in the 1.10.0 release where Spark 4.0 is first supported.
>>>>>>>> >>>>>>>>>>>>>> 4. Variant support in core and Spark 4.0. Ryan thinks this is very close and will prioritize the review.
>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>> Thanks,
>>>>>>>> >>>>>>>>>>>>>> steven
>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>> The 1.10.0 milestone can be found here.
>>>>>>>> >>>>>>>>>>>>>> https://github.com/apache/iceberg/milestone/54
>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 9:15 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>> Ajantha/Robin, thanks for the note. We can include the PR in the 1.10.0 milestone.
>>>>>>>> >>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 3:20 AM Robin Moffatt <ro...@confluent.io.invalid> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>> Thanks Ajantha.
>>>>>>>> >>>>>>>>>>>>>>>> Just to confirm, from a Confluent point of view, we will not be able to publish the connector on Confluent Hub until this CVE[1] is fixed.
>>>>>>>> >>>>>>>>>>>>>>>> Since we would not publish a snapshot build, if the fix doesn't make it into 1.10 then we'd have to wait for 1.11 (or a dot release of 1.10) to be able to include the connector on Confluent Hub.
>>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>> Thanks, Robin.
>>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>> [1] https://github.com/apache/iceberg/issues/10745#issuecomment-3074300861
>>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>> On Wed, 16 Jul 2025 at 04:03, Ajantha Bhat <ajanthab...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>> I have approached Confluent people to help us publish the OSS Kafka Connect Iceberg sink plugin.
>>>>>>>> >>>>>>>>>>>>>>>>> It seems we have a CVE from a dependency that blocks us from publishing the plugin.
>>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>> Please include the below PR for 1.10.0 release which fixes that.
>>>>>>>> >>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/13561
>>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>> - Ajantha
>>>>>>>> >>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 10:48 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>> > Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>> Manu, I agree this sentence probably lacks some context. The first half (as deleting/inserting rows) is probably about the row lineage handling with equality deletes, which is described in another place.
>>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>> "Row lineage does not track lineage for rows updated via Equality Deletes, because engines using equality deletes avoid reading existing data before writing changes and can't provide the original row ID for the new rows. These updates are always treated as if the existing row was completely removed and a unique new row was added."
>>>>>>>> >>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 5:49 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks Steven, I missed that part but the following sentence is a bit hard to understand (maybe just me)
>>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>> Engines may model operations as deleting/inserting rows or as modifications to rows that preserve row ids.
>>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>> Can you please help to explain?
>>>>>>>> >>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>> On Tue, Jul 15, 2025 at 4:41 AM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>> Manu
>>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>> The spec already covers the row lineage carry over (for replace)
>>>>>>>> >>>>>>>>>>>>>>>>>>>> https://iceberg.apache.org/spec/#row-lineage
>>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>> "When an existing row is moved to a different data file for any reason, writers should write _row_id and _last_updated_sequence_number according to the following rules:"
>>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>> >>>>>>>>>>>>>>>>>>>> Steven
>>>>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Mon, Jul 14, 2025 at 1:38 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>> another update on the release.
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>> We have one open PR left for the 1.10.0 milestone (with 25 closed PRs). Amogh is actively working on the last blocker PR.
>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Spark 4.0: Preserve row lineage information on compaction
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I will publish a release candidate after the above blocker is merged and backported.
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Steven
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 7, 2025 at 11:56 PM Manu Zhang <owenzhang1...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Hi Amogh,
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Is it defined in the table spec that the "replace" operation should carry over existing lineage info instead of assigning new IDs? If not, we'd better define it in the spec first, because all engines and implementations need to follow it.
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Tue, Jul 8, 2025 at 11:44 AM Amogh Jahagirdar <2am...@gmail.com> wrote:
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> One other area I think we need to make sure works with row lineage before release is data file compaction. At the moment, it looks like compaction will read the records from the data files without projecting the lineage fields. What this means is that on write of the new compacted data files we'd be losing the lineage information. There's no data change in a compaction but we do need to make sure the lineage info from carried over records is materialized in the newly compacted files so they don't get new IDs or inherit the new file sequence number. I'm working on addressing this as well, but I'd call this out as a blocker as well.
>>>>>>>> >>>>>>>>>>>>>>>>
>>>>>>>> >>>>>>>>>>>>>>>> --
>>>>>>>> >>>>>>>>>>>>>>>> Robin Moffatt
>>>>>>>> >>>>>>>>>>>>>>>> Sr. Principal Advisor, Streaming Data Technologies
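
The compaction concern Amogh raises in the quoted message above can be illustrated with a small sketch. This is not Iceberg code: the `Row` and `DataFile` classes and the `compact` function are hypothetical names made up for illustration, and the inheritance rule for null lineage fields (`first_row_id` plus the row's position, and the file's data sequence number) is an assumption taken from the row lineage section of the spec linked in the thread. The only point shown is that a rewrite has to materialize the effective `_row_id` and `_last_updated_sequence_number` into the new file, because otherwise the carried-over rows would inherit fresh values from the compacted file.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Row:
    # Hypothetical in-memory stand-in for a stored record.
    value: str
    row_id: Optional[int] = None            # _row_id; None means "inherit"
    last_updated_seq: Optional[int] = None  # _last_updated_sequence_number


@dataclass
class DataFile:
    # Hypothetical stand-in for a data file plus its manifest metadata.
    first_row_id: int
    data_sequence_number: int
    rows: List[Row]


def effective_lineage(f: DataFile, pos: int) -> Tuple[int, int]:
    """Resolve lineage for the row at `pos`, inheriting null fields.

    Assumed rule: a null _row_id resolves to first_row_id + position, and a
    null _last_updated_sequence_number resolves to the file's sequence number.
    """
    row = f.rows[pos]
    row_id = row.row_id if row.row_id is not None else f.first_row_id + pos
    seq = (row.last_updated_seq
           if row.last_updated_seq is not None
           else f.data_sequence_number)
    return row_id, seq


def compact(files: List[DataFile], new_first_row_id: int, new_seq: int) -> DataFile:
    """Rewrite rows into one file, writing resolved lineage explicitly."""
    out: List[Row] = []
    for f in files:
        for pos, row in enumerate(f.rows):
            row_id, seq = effective_lineage(f, pos)
            # Materialize both fields so nothing is re-inherited from the
            # new file's first_row_id or sequence number.
            out.append(Row(row.value, row_id, seq))
    return DataFile(new_first_row_id, new_seq, out)


a = DataFile(first_row_id=100, data_sequence_number=3, rows=[Row("x"), Row("y")])
b = DataFile(first_row_id=200, data_sequence_number=5,
             rows=[Row("z", row_id=7, last_updated_seq=2)])
merged = compact([a, b], new_first_row_id=1000, new_seq=9)
lineage = [effective_lineage(merged, i) for i in range(len(merged.rows))]
print(lineage)
```

If `compact` copied only `row.value` and left the lineage fields null, the same rows would resolve to IDs starting at 1000 with sequence number 9, which is the loss of lineage the message describes.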