Re: [DISCUSS] row timestamp proposal

Ryan Blue Tue, 09 Dec 2025 13:31:03 -0800

I don't think it is a good idea to expose timestamps at the row level.
The metadata timestamps that would be carried down to the row level already
confuse people who expect them to be useful or reliable, when they are
really only meant for debugging. I think extending this to the row level
would only make the problem worse.

You can already get this information by projecting the last updated
sequence number, which is reliable, and joining to timestamps from the
snapshots metadata table. Of course, the drawback there is losing the
timestamp information when snapshots expire, but since the timestamp isn't
reliable anyway I'd be fine with that.
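
To make the join above concrete, here is a minimal sketch in plain Python
rather than against a real engine. The `snapshot_commit_times` mapping, the
`committed_at` key, and the sample rows are all hypothetical stand-ins: it
assumes you can obtain a sequence number to commit time mapping from
snapshot metadata, which may take an extra step depending on the engine.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for snapshot metadata: sequence number -> commit time.
# Sequence number 2 is missing on purpose, to model an expired snapshot.
snapshot_commit_times = {
    1: datetime(2025, 12, 1, 10, 0, tzinfo=timezone.utc),
    3: datetime(2025, 12, 3, 9, 30, tzinfo=timezone.utc),
}

# Hypothetical rows with the last updated sequence number projected.
rows = [
    {"id": "a", "_last_updated_sequence_number": 1},
    {"id": "b", "_last_updated_sequence_number": 2},
    {"id": "c", "_last_updated_sequence_number": 3},
]


def with_commit_time(rows, commit_times):
    """Left-join rows to commit times on the last updated sequence number.

    Rows whose snapshot is no longer known get committed_at = None.
    """
    return [
        {**row, "committed_at": commit_times.get(row["_last_updated_sequence_number"])}
        for row in rows
    ]


joined = with_commit_time(rows, snapshot_commit_times)
for row in joined:
    print(row["id"], row["committed_at"])
```

Row `b` comes back without a timestamp, which is the drawback mentioned
above: once the snapshot expires, the sequence number is still there but the
timestamp it maps to is gone.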

Some of the use cases, like auditing and compliance, are probably better
served by embedding a timestamp in DML (like `current_timestamp`) rather
than relying on an implicit one from table metadata. And for those use
cases, you could also keep a longer history of snapshot timestamps, like
storing a catalog's event log for long-term access to timestamp info. I
think that would be better than storing it at the row level.
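
As a rough illustration of embedding the timestamp in DML, the sketch below
uses Python's built-in `sqlite3` module as a stand-in engine, not Iceberg.
The `events` table and its `updated_at` column are hypothetical, and the
exact SQL syntax will differ between engines.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a real table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id TEXT PRIMARY KEY, payload TEXT, updated_at TEXT)"
)

# The writer sets the timestamp explicitly as part of the DML statement.
conn.execute("INSERT INTO events VALUES ('a', 'created', CURRENT_TIMESTAMP)")

# Updates refresh the same column explicitly as well.
conn.execute(
    "UPDATE events SET payload = 'changed', updated_at = CURRENT_TIMESTAMP "
    "WHERE id = 'a'"
)

row = conn.execute("SELECT id, payload, updated_at FROM events").fetchone()
print(row)
```

Because the timestamp is an ordinary column written by the job, it survives
compaction and snapshot expiration like any other data, and its meaning is
defined by the writer rather than by table metadata.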

On Mon, Dec 8, 2025 at 3:46 PM Steven Wu <[email protected]> wrote:

> Hi,
>
> For V4 spec, I have a small proposal [1] to expose the row timestamp
> concept that can help with many use cases like temporal queries, latency
> tracking, TTL, auditing and compliance.
>
> This *_last_updated_timestamp_ms* metadata column behaves very similarly
> to the *_last_updated_sequence_number* for row lineage.
>
>    - Initially, it inherits from the snapshot timestamp.
>    - During rewrite (like compaction), its values are persisted in the
>    data files.
>
> Would love to hear what you think.
>
> Thanks,
> Steven
>
> [1]
> https://docs.google.com/document/d/1cXr_RwEO6o66S8vR7k3NM8-bJ9tH2rkh4vSdMXNC8J8/edit?usp=sharing
>
>
>
