Re: [DISCUSS] V3 spec: add monotonic requirement to data DV

Szehon Ho Tue, 02 Dec 2025 16:55:58 -0800

>
> Szehon, I didn't quite understand this question. Can you elaborate a bit?



Yeah, I was wondering about the scenario you are discussing above, a new file:

>
>    - whose persisted row-id value is lower than the snapshot's
>    first-row-id
>
>
>    - whose last-updated-seq-number is not set and is inherited from the
>    snapshot sequence number
>
> I saw your interpretation though that it is not explicitly allowed.

Overall, I was just trying to reason about whether it's beneficial to
disallow a quick un-delete in the scenario that you describe, given the
difficulty of implementing row lineage and other things. The scenario is
not really a violation of the current row-lineage spec as initially
stated, but it is definitely troublesome.
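
To make the scenario quoted below concrete, here is a minimal Python
sketch. It is a toy model, not Iceberg code: the `DataFile` class, its
field names, and `live_rows` are made up for illustration, and it assumes
both lineage values are inherited purely from the data file (row-id as the
file's first row-id plus the row position, last-updated-sequence-number as
the file's data sequence number).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Toy stand-in for a data file entry; field names are illustrative."""
    first_row_id: int          # row-id inherited by the row at position 0
    data_sequence_number: int  # inherited as last-updated-sequence-number
    num_rows: int


def live_rows(data_file, deleted_positions):
    """Return {position: (row_id, last_updated_seq)} for rows not masked by the DV.

    Both values come from the data file alone, so they do not depend on
    which snapshot is being read.
    """
    return {
        pos: (data_file.first_row_id + pos, data_file.data_sequence_number)
        for pos in range(data_file.num_rows)
        if pos not in deleted_positions
    }


# Snapshot 10 (first-row-id: 100): file added, row X sits at position 5.
f = DataFile(first_row_id=100, data_sequence_number=10, num_rows=10)
snapshot_10 = live_rows(f, deleted_positions=set())

# Snapshot 11 (first-row-id: 200): row X deleted via DV.
snapshot_11 = live_rows(f, deleted_positions={5})

# Snapshot 12 (first-row-id: 300): DV rewritten with position 5 unset.
snapshot_12 = live_rows(f, deleted_positions=set())

print(snapshot_12[5])  # (105, 10)
```

In this model the restored row reads back as (105, 10), the same as in
snapshot 10, although a row added in snapshot 12 would be expected to have
a row-id >= 300 and a last-updated-sequence-number of 12.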

Thanks,
Szehon

On Tue, Dec 2, 2025 at 2:52 PM Steven Wu <[email protected]> wrote:

> Let's look at the following scenario
>
> * Snapshot 10 (first-row-id: 100)
>   - A new data file was added and it contains row X. Row X inherits row-id
> as 105 and last-updated-sequence-number as 10
> * Snapshot 11 (first-row-id: 200)
>   - Row X was deleted via DV
> * Snapshot 12 (first-row-id: 300)
>   - Row X was restored (added back) by rewriting the DV with the delete
> position unset.
>
> When querying the table after snapshot 12, Row X would have row-id
> 105 and last-updated-sequence-number 10 (just as at the initial add in
> snapshot 10). The correct last-updated-sequence-number should be 12 and
> the row-id should be >= 300 for the added/restored row X.
>
> Hence, we are proposing that it is invalid to restore a row by rewriting
> the DV or position delete file and unsetting the delete position.
>
> > But if a data file has all rows that have 'row-id' set and
> 'last_updated_sequence_number' unset, technically this can be a valid
> undelete, is it right?
>
> Szehon, I didn't quite understand this question. Can you elaborate a bit?
>
>
>
>
> On Tue, Dec 2, 2025 at 2:12 PM Szehon Ho <[email protected]> wrote:
>
>> Hi,
>>
>> Sorry, I re-read the thread and Peter's question more closely, and wanted
>> to make sure that we are not precluding something unnecessarily, and to see
>> whether we can solve the code problem in other ways.
>>
>> The concern is that in the 'undeleted' row, the row_id and
>> last_updated_seq_number are wrong.
>>
>>    - If 'row-id' is not set, it inherits a row-id that is changed, which
>>    is wrong
>>    - If 'last_updated_sequence_number' is set, then it is wrong because
>>    it should refer to the snapshot that 'undeleted it'.
>>
>> Is that correct?
>>
>> But if a data file has all rows that have 'row-id' set and
>> 'last_updated_sequence_number' unset, technically this can be a valid
>> undelete, is it right?
>>
>> Thanks
>> Szehon
>>
>> On Mon, Dec 1, 2025 at 11:08 AM Steven Wu <[email protected]> wrote:
>>
>>>
>>> > _row_id a unique long identifier for every row within the table. The
>>> value is assigned via inheritance when a row is first added to the table.
>>>
>>> Actually, the current spec doesn't allow explicitly assigning row-id for
>>> new rows.
>>>
>>> So currently we don't need to worry about the question of whether it is
>>> allowed to have *new* rows with explicitly assigned row-id values lower
>>> than the snapshot's first-row-id.
>>>
>>> On Mon, Dec 1, 2025 at 9:50 AM Steven Wu <[email protected]> wrote:
>>>
>>>> Here is the spec PR to clarify undelete is not allowed. Will start a
>>>> vote thread for that.
>>>> https://github.com/apache/iceberg/pull/14731
>>>>
>>>> Let me start a new discussion thread for the first-row-id and row-id
>>>> question for row lineage to get more attention and input.
>>>>
>>>> On Sat, Nov 22, 2025 at 7:02 AM Péter Váry <[email protected]>
>>>> wrote:
>>>>
>>>>> Apologies if I was unclear. As Steven also mentioned, I wanted to
>>>>> confirm whether we agree on the clarification regarding the `row-id` and
>>>>> `first-row-id`.
>>>>>
>>>>> Steven Wu <[email protected]> wrote (on Sat, Nov 22, 2025,
>>>>> 15:28):
>>>>>
>>>>>> Just to clarify, I was asking a question.
>>>>>>
>>>>>> Is it valid to add a new data file with a row:
>>>>>>
>>>>>>    - whose persisted row-id value is lower than the snapshot's
>>>>>>    first-row-id
>>>>>>    - whose last-updated-seq-number is not set and is inherited from the
>>>>>>    snapshot sequence number?
>>>>>>
>>>>>> Thanks,
>>>>>> Steven
>>>>>>
>>>>>> On Fri, Nov 21, 2025 at 11:25 PM Péter Váry <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> +1 for this proposal
>>>>>>>
>>>>>>> Slightly related, but we can move this to a separate thread if it
>>>>>>> needs independent discussion: We should clarify the relationship between
>>>>>>> `row-id` and `first-row-id`. This has come up several times in our
>>>>>>> discussions about the equality delete removal proposal, where we considered
>>>>>>> generating `row-ids` manually instead of relying on the auto-assignment
>>>>>>> feature.
>>>>>>>
>>>>>>> As discussed with Steven:
>>>>>>>
>>>>>>>> It is valid to add a new data file with a row:
>>>>>>>>
>>>>>>>>    - whose persisted row-id value is lower than the snapshot's
>>>>>>>>    first-row-id
>>>>>>>>    - whose last-updated-seq-number is not set and is inherited from the
>>>>>>>>    snapshot sequence number
>>>>>>>>
>>>>>>>>
>>>>>>> Prashant Singh <[email protected]> wrote (on Sat, Nov 22, 2025,
>>>>>>> 5:29):
>>>>>>>
>>>>>>>> +1 for making it explicit that an *undelete* of a row can't be
>>>>>>>> done by unsetting the corresponding bit in the DV.
>>>>>>>>
>>>>>>>> *Rows should only be added via new data files* sounds reasonable
>>>>>>>> to me!
>>>>>>>>
>>>>>>>> Apart from row lineage, it also complicates operation type
>>>>>>>> inference, like here [1], as we would now have to
>>>>>>>> inspect the contents of these DVs to see if it's an insert.
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> https://github.com/apache/iceberg/pull/14581#discussion_r2533057189
>>>>>>>>
>>>>>>>> On Sat, Nov 22, 2025 at 4:48 AM Szehon Ho <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> It makes sense to me, it sounds like a minor clarification.  For
>>>>>>>>> v2 position deletes, code like rewrite_position_deletes may have made some
>>>>>>>>> assumptions like this and would not work well if violated, maybe other code
>>>>>>>>> as well.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Szehon
>>>>>>>>>
>>>>>>>>> On Fri, Nov 21, 2025 at 3:03 PM Steven Wu <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Similar weird behavior can also happen for V2 position delete
>>>>>>>>>> files with `undelete`.
>>>>>>>>>>
>>>>>>>>>> In V2, there could be multiple position delete files (say pd1,
>>>>>>>>>> pd2) associated with the same data file (say f1). Let's say pd1 deletes
>>>>>>>>>> rows 5 and 10 and pd2 deletes row 15.
>>>>>>>>>> 1. a new snapshot is committed with pd1 (DELETED), pd2
>>>>>>>>>> (EXISTING), and pd3 (ADDED). pd3 deletes only row 10 (undeleting row 5)
>>>>>>>>>> 2. a new snapshot is committed with pd1 (DELETED) and pd2
>>>>>>>>>> (EXISTING)
>>>>>>>>>>
>>>>>>>>>> In either case, essentially some rows are added (back) to the
>>>>>>>>>> table with a lower sequence number than the new snapshot's sequence number.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just to recap the question: should the spec (v2 and v3) spell out
>>>>>>>>>> that `undelete row` is not allowed? Rows should only be added via new data
>>>>>>>>>> files.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Nov 21, 2025 at 1:09 PM Steven Wu <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> >Are we specifically stating somewhere that all row-ids should
>>>>>>>>>>> be higher than or equal to the snapshot's `first-row-id`?
>>>>>>>>>>> In my mental model the `first-row-id` is only applicable for
>>>>>>>>>>> rows that don't have a specific row-id assigned.
>>>>>>>>>>>
>>>>>>>>>>> I meant an ADDED row should have a `row-id` higher than or equal
>>>>>>>>>>> to the snapshot's `first-row-id`. An EXISTING or UPDATED row can have a
>>>>>>>>>>> lower row-id.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Nov 21, 2025 at 1:04 PM Steven Wu <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> > Can we create a validator to prevent this from happening?
>>>>>>>>>>>>
>>>>>>>>>>>> We don't have this problem with the Java implementation.
>>>>>>>>>>>> `BaseDVFileWriter` merges the previous DV with the new delta DV. So there
>>>>>>>>>>>> is no `undelete` behavior. I am not aware of any Java API to allow
>>>>>>>>>>>> "undelete". So we probably don't need to add any validation code in the
>>>>>>>>>>>> Java impl.
>>>>>>>>>>>>
>>>>>>>>>>>> Just thought it is good to spell it out in the spec so that
>>>>>>>>>>>> clients/engines can be clear about the expected behavior.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Nov 21, 2025 at 12:18 PM Péter Váry <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Are we specifically stating somewhere that all row-ids should
>>>>>>>>>>>>> be higher than or equal to the snapshot's `first-row-id`?
>>>>>>>>>>>>> In my mental model the `first-row-id` is only applicable for
>>>>>>>>>>>>> rows that don't have a specific row-id assigned.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nonetheless, I agree that the `row-id` and the
>>>>>>>>>>>>> `last-updated-seq-num` should have changed to new values, so we can say that
>>>>>>>>>>>>> undeleting a row is not allowed because of this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can we create a validator to prevent this from happening?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Steven Wu <[email protected]> wrote (on Fri, Nov 21, 2025,
>>>>>>>>>>>>> 21:11):
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The undeleted row would have invalid `row-id` and
>>>>>>>>>>>>>> `last-updated-seq-num`. Since it is a new row (added back), it should have
>>>>>>>>>>>>>> the `row-id` higher than or equal to the snapshot's `first-row-id` and the
>>>>>>>>>>>>>> `last-updated-seq-number` should inherit/have the new snapshot's sequence
>>>>>>>>>>>>>> number.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Nov 21, 2025 at 11:48 AM Steven Wu <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Should we clarify the V3 spec to explicitly forbid "*undelete*"
>>>>>>>>>>>>>>> of a row by unsetting the DV bit? Unsetting a
>>>>>>>>>>>>>>> DV bit essentially adds a row with a lower row-id than the snapshot's
>>>>>>>>>>>>>>> first-row-id, which would violate the row lineage spec. With the
>>>>>>>>>>>>>>> restriction, DV cardinality should be monotonically increasing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
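
For engines that do not merge DVs the way the Java writer is described as
doing above, the "no undelete" rule discussed in this thread could be
checked roughly as follows. This is only a sketch under assumptions:
delete sets are modeled as plain Python sets of row positions, and
`undeleted_positions` and `validate_no_undelete` are hypothetical names,
not part of any Iceberg API. It checks that the new delete set contains
the previous one, which is slightly stricter than comparing cardinality
alone.

```python
def undeleted_positions(previous_deletes, new_deletes):
    """Positions that were deleted before the rewrite but are live after it."""
    return sorted(set(previous_deletes) - set(new_deletes))


def validate_no_undelete(previous_deletes, new_deletes):
    """Hypothetical check: the new delete set must contain the old one."""
    restored = undeleted_positions(previous_deletes, new_deletes)
    if restored:
        raise ValueError(f"undelete of positions {restored} is not allowed")


# V2 example from the thread: pd1 deletes rows 5 and 10, pd2 deletes row 15.
pd1, pd2, pd3 = {5, 10}, {15}, {10}
before = pd1 | pd2

case_1 = pd2 | pd3  # pd1 DELETED, pd2 EXISTING, pd3 ADDED
case_2 = pd2        # pd1 DELETED, pd2 EXISTING

print(undeleted_positions(before, case_1))  # [5]
print(undeleted_positions(before, case_2))  # [5, 10]

# Same cardinality, but position 5 is restored: a count-only check misses it.
print(undeleted_positions({5}, {6}))  # [5]

# Merging the previous deletes with a new delta passes the check.
validate_no_undelete(before, before | {20})
```

The superset comparison is used here because a rewrite that unsets one
position and sets another keeps the cardinality unchanged while still
restoring a row.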
