> An explicit timestamp column adds more burden to application developers.
> While some databases require an explicit column in the schema, those
> databases provide triggers to auto set the column value. For Iceberg, the
> snapshot timestamp is the closest to the trigger timestamp.
I wonder if we should be looking at generalizing the audit column in Iceberg and letting this be configured at the table level. Other common audit fields that some people might want without keeping snapshot history:

1. Insertion time
2. Created by
3. Updated by

On Tue, Dec 9, 2025 at 2:23 PM Steven Wu <[email protected]> wrote:

> Ryan, thanks a lot for the feedback!
>
> Regarding the concern for reliable timestamps, we are not proposing using
> timestamps for ordering. With NTP in modern computers, they are generally
> reliable enough for the intended use cases. Also some environments may have
> stronger clock service, like Spanner TrueTime service
> <https://docs.cloud.google.com/spanner/docs/true-time-external-consistency>.
>
> > joining to timestamps from the snapshots metadata table.
>
> As you also mentioned, it depends on the snapshot history, which is often
> retained for a few days due to performance reasons.
>
> > embedding a timestamp in DML (like `current_timestamp`) rather than
> > relying on an implicit one from table metadata.
>
> An explicit timestamp column adds more burden to application developers.
> While some databases require an explicit column in the schema, those
> databases provide triggers to auto set the column value. For Iceberg, the
> snapshot timestamp is the closest to the trigger timestamp.
>
> Also, the timestamp set during computation (like streaming ingestion or
> relatively long batch computation) doesn't capture the time the rows/files
> are added to the Iceberg table in a batch fashion.
>
> > And for those use cases, you could also keep a longer history of
> > snapshot timestamps, like storing a catalog's event log for long-term
> > access to timestamp info
>
> This is not really consumable by joining the regular table query with
> catalog event log. I would also imagine catalog event log is capped at
> shorter retention (maybe a few months) compared to data retention (could be
> a few years).
>
> On Tue, Dec 9, 2025 at 1:32 PM Ryan Blue <[email protected]> wrote:
>
>> I don't think it is a good idea to expose timestamps at the row level.
>> Timestamps in metadata that would be carried down to the row level already
>> confuse people that expect them to be useful or reliable, rather than for
>> debugging. I think extending this to the row level would only make the
>> problem worse.
>>
>> You can already get this information by projecting the last updated
>> sequence number, which is reliable, and joining to timestamps from the
>> snapshots metadata table. Of course, the drawback there is losing the
>> timestamp information when snapshots expire, but since it isn't reliable
>> anyway I'd be fine with that.
>>
>> Some of the use cases, like auditing and compliance, are probably better
>> served by embedding a timestamp in DML (like `current_timestamp`) rather
>> than relying on an implicit one from table metadata. And for those use
>> cases, you could also keep a longer history of snapshot timestamps, like
>> storing a catalog's event log for long-term access to timestamp info. I
>> think that would be better than storing it at the row level.
>>
>> On Mon, Dec 8, 2025 at 3:46 PM Steven Wu <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> For V4 spec, I have a small proposal [1] to expose the row timestamp
>>> concept that can help with many use cases like temporal queries, latency
>>> tracking, TTL, auditing and compliance.
>>>
>>> This *_last_updated_timestamp_ms* metadata column behaves very
>>> similarly to the *_last_updated_sequence_number* for row lineage.
>>>
>>> - Initially, it inherits from the snapshot timestamp.
>>> - During rewrite (like compaction), its values are persisted in the
>>> data files.
>>>
>>> Would love to hear what you think.
>>>
>>> Thanks,
>>> Steven
>>>
>>> [1]
>>> https://docs.google.com/document/d/1cXr_RwEO6o66S8vR7k3NM8-bJ9tH2rkh4vSdMXNC8J8/edit?usp=sharing
