I recommend not adding this restriction to the spec, for two reasons.

1. When a row is restored, it is desirable that the row ID is restored as well. So this sounds like a feature and not a bug to me.
2. Table RESTOREs are a commonly used feature, and it would be prohibitively expensive to rewrite data files to restore deleted rows back into the table.

Best,
Anoop

On Tue, Dec 2, 2025 at 4:56 PM Szehon Ho <[email protected]> wrote:

> > Szehon, I didn't quite understand this question. Can you elaborate a bit?
>
> Yea I was wondering in the scenario you are discussing above, a new file:
>
>> - whose persisted row-id value is lower than the snapshot's first-row-id
>> - whose last-updated-seq-number is not set and inherits from the snapshot sequence number
>
> I saw your interpretation though that it is not explicitly allowed.
>
> Overall, I was just trying to reason about whether it's beneficial to disallow a quick un-delete in the scenario that you describe, due to the difficulty of implementing the row-lineage and other things, as the scenario is not really a violation of the current row-lineage spec as initially stated, but definitely troublesome.
>
> Thanks,
> Szehon
>
> On Tue, Dec 2, 2025 at 2:52 PM Steven Wu <[email protected]> wrote:
>
>> Let's look at the following scenario
>>
>> * Snapshot 10 (first-row-id: 100)
>>   - A new data file was added and it contains row X. Row X inherits row-id as 105 and last-updated-sequence-number as 10
>> * Snapshot 11 (first-row-id: 200)
>>   - Row X was deleted via DV
>> * Snapshot 12 (first-row-id: 300)
>>   - Row X was restored (added back) by rewriting DV and with the delete position unset.
>>
>> When querying the table after snapshot 12, the Row X would have the row-id as 105 and last-updated-sequence-number as 10 (just as the initial add at snapshot 10). The correct last-updated-sequence-number should be 12 and row-id should be >=300 for added/restored row X.
>>
>> Hence, we are proposing that it is invalid to restore a row by rewriting the DV or position delete file and unsetting the delete position.
>>
>> > But if a data file has all rows that have 'row-id' set and 'last_updated_sequence_number' unset, technically this can be a valid undelete, is it right?
>>
>> Szehon, I didn't quite understand this question.
>> Can you elaborate a bit?
>>
>> On Tue, Dec 2, 2025 at 2:12 PM Szehon Ho <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> Sorry, I re-read the thread and Peter's question more closely, and wanted to make sure that we are not precluding something unnecessarily, and whether we can solve the code problem in other ways.
>>>
>>> The concern is that in the 'undeleted' row, the row_id and last_updated_seq_number are wrong.
>>>
>>> - If 'row-id' is not set, it inherits a row-id that is changed, which is wrong
>>> - If 'last_updated_sequence_number' is set, then it is wrong because it should refer to the snapshot that 'undeleted it'.
>>>
>>> Is that correct?
>>>
>>> But if a data file has all rows that have 'row-id' set and 'last_updated_sequence_number' unset, technically this can be a valid undelete, is it right?
>>>
>>> Thanks
>>> Szehon
>>>
>>> On Mon, Dec 1, 2025 at 11:08 AM Steven Wu <[email protected]> wrote:
>>>
>>>> > _row_id a unique long identifier for every row within the table. The value is assigned via inheritance when a row is first added to the table.
>>>>
>>>> Actually, current spec doesn't allow explicitly assigning row-id for new rows.
>>>>
>>>> So currently we don't need to worry about the question if it is allowed to have *new* rows with explicitly assigned row-id values lower than the snapshot's first-row-id.
>>>>
>>>> On Mon, Dec 1, 2025 at 9:50 AM Steven Wu <[email protected]> wrote:
>>>>
>>>>> Here is the spec PR to clarify undelete is not allowed. Will start a vote thread for that.
>>>>> https://github.com/apache/iceberg/pull/14731
>>>>>
>>>>> Let me start a new discussion thread for the first-row-id and row-id question for row lineage to get more attention and input.
>>>>>
>>>>> On Sat, Nov 22, 2025 at 7:02 AM Péter Váry <[email protected]> wrote:
>>>>>
>>>>>> Apologies if I was unclear.
>>>>>> As Steven also mentioned, I wanted to confirm whether we agree on the clarification regarding the `row-id` and `first-row-id`.
>>>>>>
>>>>>> Steven Wu <[email protected]> wrote (on Sat, Nov 22, 2025, 15:28):
>>>>>>
>>>>>>> Just to clarify, I was asking a question.
>>>>>>>
>>>>>>> Is it valid to add a new data file with a row?
>>>>>>>
>>>>>>> - whose persisted row-id value is lower than the snapshot's first-row-id
>>>>>>> - whose last-updated-seq-number is not set and inherits from the snapshot sequence number
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Steven
>>>>>>>
>>>>>>> On Fri, Nov 21, 2025 at 11:25 PM Péter Váry <[email protected]> wrote:
>>>>>>>
>>>>>>>> +1 for this proposal
>>>>>>>>
>>>>>>>> Slightly related, but we can move this to a separate thread if it needs independent discussion: We should clarify the relationship between `row-id` and `first-row-id`. This has come up several times in our discussions about the equality delete removal proposal, where we considered generating `row-ids` manually instead of relying on the auto-assignment feature.
>>>>>>>>
>>>>>>>> As discussed with Steven:
>>>>>>>>
>>>>>>>>> It is valid to add a new data file with a row:
>>>>>>>>>
>>>>>>>>> - whose persisted row-id value is lower than the snapshot's first-row-id
>>>>>>>>> - whose last-updated-seq-number is not set and inherits from the snapshot sequence number
>>>>>>>>>
>>>>>>>> Prashant Singh <[email protected]> wrote (on Sat, Nov 22, 2025, 5:29):
>>>>>>>>
>>>>>>>>> +1 for making it explicit that an *undelete* of a row can't be done by unsetting the corresponding bit in DV
>>>>>>>>>
>>>>>>>>> *Rows should only be added via new data files*, sounds reasonable to me!
>>>>>>>>>
>>>>>>>>> Apart from row lineage, it also complicates operation type inference, like here [1], as we would now have to inspect the contents of these DVs to see if it's an insert?
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/iceberg/pull/14581#discussion_r2533057189
>>>>>>>>>
>>>>>>>>> On Sat, Nov 22, 2025 at 4:48 AM Szehon Ho <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> It makes sense to me, it sounds like a minor clarification. For v2 position deletes, code like rewrite_position_deletes may have made some assumptions like this and would not work well if violated, maybe other code as well.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Szehon
>>>>>>>>>>
>>>>>>>>>> On Fri, Nov 21, 2025 at 3:03 PM Steven Wu <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Similar weird behavior can also happen for V2 position delete files with `undelete`.
>>>>>>>>>>>
>>>>>>>>>>> In V2, there could be multiple position delete files (say pd1, pd2) associated with the same data file (say f1). Let's say pd1 deletes rows 5 and 10 and pd2 deletes row 15.
>>>>>>>>>>> 1. a new snapshot is committed with pd1 (DELETED), pd2 (EXISTING), and pd3 (ADDED). pd3 deletes only row 10 (undeleted row 5)
>>>>>>>>>>> 2. a new snapshot is committed with pd1 (DELETED) and pd2 (EXISTING)
>>>>>>>>>>>
>>>>>>>>>>> In either case, essentially some rows are added (back) to the table with lower sequence number than the new snapshot's sequence number.
>>>>>>>>>>>
>>>>>>>>>>> Just to recap the question: should the spec (v2 and v3) spell out that `undelete row` is not allowed? Rows should only be added via new data files.
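
The pd1/pd2/pd3 example above, and the validator question raised elsewhere in this thread, can be sketched as a set comparison. This is only an illustration under stated assumptions: the function names `merged_deletes` and `undeleted_positions` are hypothetical and not part of any Iceberg API, delete files are modeled as plain Python sets of row positions, and the data file f1 is assumed to still be live in both snapshots.

```python
# Hypothetical sketch, not an Iceberg API: delete files are modeled as
# {delete_file_name: set_of_deleted_row_positions} for one data file (f1).

def merged_deletes(delete_files):
    """Union the deleted positions of all delete files that apply to one data file."""
    merged = set()
    for positions in delete_files.values():
        merged |= positions
    return merged


def undeleted_positions(before, after):
    """Positions that were deleted in the previous snapshot but are live again."""
    return merged_deletes(before) - merged_deletes(after)


# State before the commit: pd1 deletes rows 5 and 10, pd2 deletes row 15.
before = {"pd1": {5, 10}, "pd2": {15}}

# Case 1: pd1 is removed, pd2 is kept, pd3 is added and only deletes row 10.
case1 = {"pd2": {15}, "pd3": {10}}

# Case 2: pd1 is removed, pd2 is kept, nothing is added.
case2 = {"pd2": {15}}

print(sorted(undeleted_positions(before, case1)))  # [5]
print(sorted(undeleted_positions(before, case2)))  # [5, 10]
```

A non-empty result means the commit brought rows back without adding a data file, which is the behavior the proposal would disallow. The same superset check applies to a single V3 DV per data file.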
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Nov 21, 2025 at 1:09 PM Steven Wu <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> > Are we specifically stating somewhere that all row-ids should be higher than or equal to the snapshot's `first-row-id`?
>>>>>>>>>>>> > In my mental model the `first-row-id` is only applicable for rows that don't have a specific row-id assigned.
>>>>>>>>>>>>
>>>>>>>>>>>> I meant an ADDED row should have `row-id` higher than or equal to the snapshot's `first-row-id`. EXISTING or UPDATED row can have lower row id.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Nov 21, 2025 at 1:04 PM Steven Wu <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> > Can we create a validator to prevent this from happening?
>>>>>>>>>>>>>
>>>>>>>>>>>>> We don't have this problem with the Java implementation. `BaseDVFileWriter` merges the previous DV with the new delta DV. So there is no `undelete` behavior. I am not aware of any Java API to allow "undelete". So we probably don't need to add any validation code in the Java impl.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Just thought it is good to spell it out in the spec so that clients/engines can be clear about the expected behavior.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Nov 21, 2025 at 12:18 PM Péter Váry <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Are we specifically stating somewhere that all row-ids should be higher than or equal to the snapshot's `first-row-id`?
>>>>>>>>>>>>>> In my mental model the `first-row-id` is only applicable for rows that don't have a specific row-id assigned.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nonetheless, I agree that the `row-id` and the `last-updated-seq-num` should have changed to a new one, so we can say that undeleting a row is not allowed because of this.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can we create a validator to prevent this from happening?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Steven Wu <[email protected]> wrote (on Fri, Nov 21, 2025, 21:11):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The undeleted row would have invalid `row-id` and `last-updated-seq-num`. Since it is a new row (added back), it should have the `row-id` higher than or equal to the snapshot's `first-row-id` and the `last-updated-seq-number` should inherit/have the new snapshot's sequence number.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Nov 21, 2025 at 11:48 AM Steven Wu <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Should we clarify the V3 spec to explicitly forbid "*undelete*" of a row by unsetting the DV bit? Unsetting a DV bit essentially adds a row with lower row-id than the snapshot's first-row-id, which would violate the row lineage spec. With the restriction, DV cardinality should be monotonically increasing.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Steven
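
The snapshot 10/11/12 scenario discussed in this thread can be reproduced with a small model of row-lineage inheritance. This is a sketch under stated assumptions, not Iceberg code: `DataFile` and `visible_lineage` are hypothetical names, the DV is modeled as a plain set of deleted positions, and no lineage values are persisted in the file, so every row inherits its row-id as the file's first-row-id plus its position and its last-updated-sequence-number from the file's data sequence number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file's lineage-relevant metadata."""
    first_row_id: int          # assigned when the file is first added
    data_sequence_number: int  # sequence number of the snapshot that added it
    record_count: int


def visible_lineage(data_file, deleted_positions):
    """Return {position: (row_id, last_updated_seq)} for rows not masked by the DV."""
    return {
        pos: (data_file.first_row_id + pos, data_file.data_sequence_number)
        for pos in range(data_file.record_count)
        if pos not in deleted_positions
    }


# Snapshot 10 (first-row-id: 100) adds the file; row X sits at position 5.
f1 = DataFile(first_row_id=100, data_sequence_number=10, record_count=10)
ROW_X = 5

snap10 = visible_lineage(f1, set())    # row X is live
snap11 = visible_lineage(f1, {ROW_X})  # row X deleted via DV
snap12 = visible_lineage(f1, set())    # DV rewritten with the position unset

row_id, last_updated_seq = snap12[ROW_X]
print(row_id, last_updated_seq)  # 105 10

# Snapshot 12 has first-row-id 300 and sequence number 12, so a row that is
# treated as newly added there would be expected to have row_id >= 300 and
# last_updated_seq == 12. The "undeleted" row has neither.
print(row_id >= 300, last_updated_seq == 12)  # False False
```

Because the data file itself is untouched, nothing in this model can give the restored row new lineage values. That is the inconsistency the proposal points at, and also why the restored row keeps its original row ID, which the reply at the top of this thread regards as desirable.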
