> 1. When a row is restored, it is desirable that the row ID is restored as well. So this sounds like a feature and not a bug to me.
Is “restored” even a valid concept? Most systems only support adding and deleting rows; restoration is typically treated as inserting a new row, which should receive a new ID. Even if we accept the notion of restoration, I agree with Steven’s point: the last-updated-seq-number should reflect the time of restoration. Simply removing the delete flag from the original row would be an invalid operation.

> 2. Table RESTOREs are a commonly used feature, and it would be prohibitively expensive to rewrite data files to restore deleted rows back into the table.

That's an interesting point, but I think in this specific case, most users would likely tolerate lineage corruption. For those who cannot, a full table rewrite remains an option.

Anoop Johnson <[email protected]> wrote (on Wed, Dec 3, 2025, 8:26):

> I recommend not adding this restriction to the spec for two reasons.
>
> 1. When a row is restored, it is desirable that the row ID is restored as
> well. So this sounds like a feature and not a bug to me.
> 2. Table RESTOREs are a commonly used feature, and it would be
> prohibitively expensive to rewrite data files to restore deleted rows back
> into the table.
>
> Best,
> Anoop
>
> On Tue, Dec 2, 2025 at 4:56 PM Szehon Ho <[email protected]> wrote:
>
>> Szehon, I didn't quite understand this question. Can you elaborate a bit?
>>
>> Yea I was wondering in the scenario you are discussing above, a new file:
>>
>>> - whose persisted row-id value is lower than the snapshot's
>>> first-row-id
>>> - whose last-updated-seq-number is not set and inherit from the
>>> snapshot sequence number
>>>
>>> I saw your interpretation though that it is not explicitly allowed.
>>
>> Overall, I was just trying to reason about whether it's beneficial to
>> disallow a quick un-delete in the scenario that you describe, due to the
>> difficulty of implementing the row-lineage and other things, as the
>> scenario is not really a violation of the current row-lineage spec as
>> initially stated, but definitely troublesome.
>>
>> Thanks,
>> Szehon
>>
>> On Tue, Dec 2, 2025 at 2:52 PM Steven Wu <[email protected]> wrote:
>>
>>> Let's look at the following scenario
>>>
>>> * Snapshot 10 (first-row-id: 100)
>>>   - A new data file was added and it contains row X. Row X inherits
>>>     row-id as 105 and last-updated-sequence-number as 10
>>> * Snapshot 11 (first-row-id: 200)
>>>   - Row X was deleted via DV
>>> * Snapshot 12 (first-row-id: 300)
>>>   - Row X was restored (added back) by rewriting DV and with the delete
>>>     position unset.
>>>
>>> When querying the table after snapshot 12, the Row X would have the
>>> row-id as 105 and last-updated-sequence-number as 10 (just as the initial
>>> add at snapshot 10). The correct last-updated-sequence-number should be 12
>>> and row-id should be >=300 for added/restored row X.
>>>
>>> Hence, we are proposing that it is invalid to restore a row by rewriting
>>> the DV or position delete file and unsetting the delete position.
>>>
>>> > But if a data file has all rows that have 'row-id' set and
>>> 'last_updated_sequence_number' unset, technically this can be a valid
>>> undelete, is it right?
>>>
>>> Szehon, I didn't quite understand this question. Can you elaborate a bit?
>>>
>>> On Tue, Dec 2, 2025 at 2:12 PM Szehon Ho <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Sorry, I re-read the thread and Peter's question more closely, and
>>>> wanted to explore that we are not precluding something unnecessarily, and
>>>> if we can solve the code problem in other ways.
>>>>
>>>> The concern is that in the 'undeleted' row, the row_id and
>>>> last_updated_seq_number are wrong.
>>>>
>>>> - If 'row-id' is not set, it inherits a row-id that is changed,
>>>> which is wrong
>>>> - If 'last_updated_sequence_number' is set, then it is wrong
>>>> because it should refer to the snapshot that 'undeleted it'.
>>>>
>>>> Is that correct?
>>>>
>>>> But if a data file has all rows that have 'row-id' set and
>>>> 'last_updated_sequence_number' unset, technically this can be a valid
>>>> undelete, is it right?
>>>>
>>>> Thanks
>>>> Szehon
>>>>
>>>> On Mon, Dec 1, 2025 at 11:08 AM Steven Wu <[email protected]> wrote:
>>>>
>>>>> > _row_id a unique long identifier for every row within the table.
>>>>> The value is assigned via inheritance when a row is first added to the
>>>>> table.
>>>>>
>>>>> Actually, current spec doesn't allow explicitly assigning row-id for
>>>>> new rows.
>>>>>
>>>>> So currently we don't need to worry about the question if it is
>>>>> allowed to have *new* rows with explicitly assigned row-id values
>>>>> lower than the snapshot's first-row-id.
>>>>>
>>>>> On Mon, Dec 1, 2025 at 9:50 AM Steven Wu <[email protected]> wrote:
>>>>>
>>>>>> Here is the spec PR to clarify undelete is not allowed. Will start a
>>>>>> vote thread for that.
>>>>>> https://github.com/apache/iceberg/pull/14731
>>>>>>
>>>>>> Let me start a new discussion thread for the first-row-id and row-id
>>>>>> question for row lineage to get more attention and input.
>>>>>>
>>>>>> On Sat, Nov 22, 2025 at 7:02 AM Péter Váry <[email protected]> wrote:
>>>>>>
>>>>>>> Apologies if I was unclear. As Steven also mentioned, I wanted to
>>>>>>> confirm whether we agree on the clarification regarding the `row-id` and
>>>>>>> `first-row-id`.
>>>>>>>
>>>>>>> Steven Wu <[email protected]> wrote (on Sat, Nov 22, 2025, 15:28):
>>>>>>>
>>>>>>>> Just to clarify, I was asking a question.
>>>>>>>>
>>>>>>>> Is it valid to add a new data file with a row?
>>>>>>>>
>>>>>>>> - whose persisted row-id value is lower than the snapshot's
>>>>>>>> first-row-id
>>>>>>>> - whose last-updated-seq-number is not set and inherit from the
>>>>>>>> snapshot sequence number
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Steven
>>>>>>>>
>>>>>>>> On Fri, Nov 21, 2025 at 11:25 PM Péter Váry <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> +1 for this proposal
>>>>>>>>>
>>>>>>>>> Slightly related, but we can move this to a separate thread if it
>>>>>>>>> needs independent discussion: We should clarify the relationship
>>>>>>>>> between `row-id` and `first-row-id`. This has come up several times
>>>>>>>>> in our discussions about the equality delete removal proposal, where
>>>>>>>>> we considered generating `row-ids` manually instead of relying on the
>>>>>>>>> auto-assignment feature.
>>>>>>>>>
>>>>>>>>> As discussed with Steven:
>>>>>>>>>
>>>>>>>>>> It is valid to add a new data file with a row:
>>>>>>>>>>
>>>>>>>>>> - whose persisted row-id value is lower than the snapshot's
>>>>>>>>>> first-row-id
>>>>>>>>>> - whose last-updated-seq-number is not set and inherit from
>>>>>>>>>> the snapshot sequence number
>>>>>>>>>
>>>>>>>>> Prashant Singh <[email protected]> wrote (on Sat, Nov 22, 2025, 5:29):
>>>>>>>>>
>>>>>>>>>> +1 for making it explicit that an *undelete* of a row can't be
>>>>>>>>>> done by unsetting the corresponding bit in DV
>>>>>>>>>>
>>>>>>>>>> *Rows should only be added via new data files* sounds
>>>>>>>>>> reasonable to me!
>>>>>>>>>>
>>>>>>>>>> Apart from row-lineage, it also complicates the operation type
>>>>>>>>>> inference, like here [1], as we would now have to
>>>>>>>>>> inspect the contents of these DVs to see if it's an insert?
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://github.com/apache/iceberg/pull/14581#discussion_r2533057189
>>>>>>>>>>
>>>>>>>>>> On Sat, Nov 22, 2025 at 4:48 AM Szehon Ho <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> It makes sense to me, it sounds like a minor clarification. For
>>>>>>>>>>> v2 position deletes, code like rewrite_position_deletes may have
>>>>>>>>>>> made some assumptions like this and would not work well if
>>>>>>>>>>> violated, maybe other code as well.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Szehon
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Nov 21, 2025 at 3:03 PM Steven Wu <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Similar weird behavior can also happen for V2 position delete
>>>>>>>>>>>> files with `undelete`.
>>>>>>>>>>>>
>>>>>>>>>>>> In V2, there could be multiple position delete files (say pd1,
>>>>>>>>>>>> pd2) associated with the same data file (say f1). Let's say pd1
>>>>>>>>>>>> deletes row 5 and 10 and pd2 deletes row 15.
>>>>>>>>>>>> 1. a new snapshot is committed with pd1 (DELETED), pd2
>>>>>>>>>>>> (EXISTING), and pd3 (ADDED). pd3 deletes only row 10 (undeleted
>>>>>>>>>>>> row 5)
>>>>>>>>>>>> 2. a new snapshot is committed with pd1 (DELETED) and pd2
>>>>>>>>>>>> (EXISTING)
>>>>>>>>>>>>
>>>>>>>>>>>> In either case, essentially some rows are added (back) to the
>>>>>>>>>>>> table with lower sequence number than the new snapshot's sequence
>>>>>>>>>>>> number.
>>>>>>>>>>>>
>>>>>>>>>>>> Just to recap the question: should the spec (v2 and v3) spell
>>>>>>>>>>>> out that `undelete row` is not allowed? Rows should only be added
>>>>>>>>>>>> via new data files.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Nov 21, 2025 at 1:09 PM Steven Wu <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> > Are we specifically stating somewhere that all row-ids should
>>>>>>>>>>>>> be higher than or equal to the snapshot's `first-row-id`?
>>>>>>>>>>>>> In my mental model the `first-row-id` is only applicable for
>>>>>>>>>>>>> rows that don't have a specific row-id assigned.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I meant an ADDED row should have `row-id` higher than or equal
>>>>>>>>>>>>> to the snapshot's `first-row-id`. EXISTING or UPDATED row can
>>>>>>>>>>>>> have lower row id.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Nov 21, 2025 at 1:04 PM Steven Wu <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> > Can we create a validator to prevent this from happening?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We don't have this problem with the Java implementation.
>>>>>>>>>>>>>> `BaseDVFileWriter` merges the previous DV with the new delta
>>>>>>>>>>>>>> DV. So there is no `undelete` behavior. I am not aware of any
>>>>>>>>>>>>>> Java API to allow "undelete". So we probably don't need to add
>>>>>>>>>>>>>> any validation code in the Java impl.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just thought it is good to spell it out in the spec so that
>>>>>>>>>>>>>> clients/engines can be clear about the expected behavior.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Nov 21, 2025 at 12:18 PM Péter Váry <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Are we specifically stating somewhere that all row-ids
>>>>>>>>>>>>>>> should be higher than or equal to the snapshot's `first-row-id`?
>>>>>>>>>>>>>>> In my mental model the `first-row-id` is only applicable for
>>>>>>>>>>>>>>> rows that don't have a specific row-id assigned.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Nonetheless, I agree that the `row-id` and the
>>>>>>>>>>>>>>> `last-updated-seq-num` should have changed to a new one, so we
>>>>>>>>>>>>>>> can say that undeleting a row is not allowed because of this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can we create a validator to prevent this from happening?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Steven Wu <[email protected]> wrote (on Fri, Nov 21, 2025, 21:11):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The undeleted row would have invalid `row-id` and
>>>>>>>>>>>>>>>> `last-updated-seq-num`. Since it is a new row (added back), it
>>>>>>>>>>>>>>>> should have the `row-id` higher than or equal to the
>>>>>>>>>>>>>>>> snapshot's `first-row-id` and the `last-updated-seq-number`
>>>>>>>>>>>>>>>> should inherit/have the new snapshot's sequence number.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Nov 21, 2025 at 11:48 AM Steven Wu <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Should we clarify the V3 spec to explicitly forbid
>>>>>>>>>>>>>>>>> "*undelete*" of a row by unsetting the DV bit? Unsetting a
>>>>>>>>>>>>>>>>> DV bit essentially adds a row with lower row-id than the
>>>>>>>>>>>>>>>>> snapshot's first-row-id, which would violate the row lineage
>>>>>>>>>>>>>>>>> spec. With the restriction, DV cardinality should be
>>>>>>>>>>>>>>>>> monotonically increasing.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Steven
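
For readers following the thread, the snapshot 10/11/12 scenario Steven describes can be modeled in a few lines. This is only a rough sketch under simplifying assumptions: `DataFile`, `visible_rows`, and `dv_only_grows` are hypothetical names, not Iceberg APIs; the DV is modeled as a plain set of positions; and the data file is assumed to inherit the snapshot's first-row-id (100) directly, with row X at position 5. The subset check is one possible reading of the "validator" idea raised in the thread and is stricter than only comparing DV cardinality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    # Hypothetical model for illustration only, not an Iceberg class.
    first_row_id: int          # inherited when the file was first added
    data_sequence_number: int  # sequence number of the snapshot that added it
    record_count: int


def visible_rows(data_file, deleted_positions):
    """Map position -> (row-id, last-updated-sequence-number) for rows not masked by the DV."""
    return {
        pos: (data_file.first_row_id + pos, data_file.data_sequence_number)
        for pos in range(data_file.record_count)
        if pos not in deleted_positions
    }


def dv_only_grows(previous_dv, new_dv):
    """True when every previously deleted position is still deleted (no undelete)."""
    return previous_dv <= new_dv


# Snapshot 10 (first-row-id 100): the file is added; row X is at position 5.
f1 = DataFile(first_row_id=100, data_sequence_number=10, record_count=10)
dv_snapshot_11 = {5}    # snapshot 11: row X deleted via DV
dv_snapshot_12 = set()  # snapshot 12: DV rewritten with position 5 unset

# Row X reappears with its original lineage values, not those of snapshot 12.
restored_row_x = visible_rows(f1, dv_snapshot_12)[5]
snapshot_12_first_row_id = 300
row_id_is_stale = restored_row_x[0] < snapshot_12_first_row_id
undelete_detected = not dv_only_grows(dv_snapshot_11, dv_snapshot_12)
```

In this model the restored row X still carries the row-id and sequence number from snapshot 10, which is the mismatch the thread is concerned about, and a check that a replacement DV is a superset of the previous one would flag the rewrite.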
