Thanks, Tom, for putting this together. Reconstructing when a property changed by hand-diffing metadata.json files is genuinely painful today, so the underlying need resonates.

The question I'd focus the discussion on is your #4, since it's what makes this table structurally different from the rest. The metadata-log entries retain only a timestamp and a file path (TableMetadata.MetadataLogEntry), not the property map, so surfacing historical properties necessarily means opening and parsing each previous metadata.json. No existing metadata table re-reads historical root metadata that way; the ones that do I/O (files, manifests, entries) read manifests and manifest lists for a given snapshot, not previous metadata.json files. So this would set a new precedent: a metadata table whose scan cost scales with the number of retained metadata versions.

*Two properties* of that approach I'd want pinned down before the schema or name, because they're independent of both:

- Read cost. In the current PR the rows are materialized when the scan is planned, reading up to write.metadata.previous-versions-max (default 100) files sequentially, so a single query expands into that many object-store reads before it returns any row.
- Failure mode. TableMetadataParser.read throws if a referenced file can't be read, and there's no per-row handling, so one absent or unreadable metadata file fails the whole scan. metadata_log_entries never has this failure mode because it builds its rows from in-memory metadata and does no such reads.

If the community decides property history deserves first-class support, it may be worth revisiting the "Alternatives Considered" section: carrying the values in the metadata-log entry or in a dedicated structure keeps the read path I/O-free, at the cost of the write/spec change you noted. That's the trade-off I'd most like to hear others weigh in on.

On naming and schema I don't have strong objections. One thing worth making explicit in the docs is that the table emits one row per retained metadata version, not per property change, so consecutive rows are frequently identical.
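
To make that last point concrete, here is a minimal sketch of what a consumer would have to do to turn "one row per metadata version" into "one row per property change". It is plain Python, not Iceberg code: the row shape only mirrors the proposed columns, and the function name `property_changes`, the file names, and the sample values are all made up for illustration.

```python
from typing import Dict, Iterable, Iterator, Optional


def property_changes(rows: Iterable[dict]) -> Iterator[dict]:
    """Yield only the rows whose `properties` differ from the previous row.

    Assumes `rows` is ordered by `timestamp` ascending, one dict per
    retained metadata version (hypothetical shape mirroring the proposed
    columns: timestamp, file, latest_snapshot_id, properties).
    """
    previous: Optional[Dict[str, str]] = None
    seen_first = False
    for row in rows:
        current = row["properties"]
        # The oldest retained row is always emitted as the baseline.
        if not seen_first or current != previous:
            yield row
        previous = current
        seen_first = True


# Made-up sample rows, purely illustrative.
rows = [
    {"timestamp": 1, "file": "v1.metadata.json", "latest_snapshot_id": None,
     "properties": {"gc.enabled": "true"}},
    # A commit that only added a snapshot: properties are unchanged.
    {"timestamp": 2, "file": "v2.metadata.json", "latest_snapshot_id": 10,
     "properties": {"gc.enabled": "true"}},
    {"timestamp": 3, "file": "v3.metadata.json", "latest_snapshot_id": 10,
     "properties": {"gc.enabled": "false"}},
    {"timestamp": 4, "file": "v4.metadata.json", "latest_snapshot_id": 11,
     "properties": {"gc.enabled": "false"}},
]

changes = list(property_changes(rows))
```

In this sample only the rows at timestamps 1 and 3 survive. Note that the first surviving row is just the oldest retained version, not necessarily the commit that set the property, which is another thing the docs could spell out.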

Overall I think it's worth pursuing if the read-cost and failure-handling story is nailed down.

Regards,
Tanmay Rauth

On Tue, Jun 30, 2026 at 12:39 AM Tomohiro Tanaka <[email protected]> wrote:

> Hello everyone,
>
> I’d like to ask for feedback on whether adding a `table_properties_log`
> metadata table is a direction worth pursuing.
>
> PR: https://github.com/apache/iceberg/pull/16859
>
> This PR adds a read-only metadata table that exposes the history of table
> properties from retained Iceberg metadata files.
> In the current version of Apache Iceberg, if users want to understand when
> a table property changed, they need to follow the metadata log/previous
> metadata files and inspect `metadata.json` files manually.
> The PR enables to retain table properties for each snapshot version
> through the existing metadata table mechanism.
>
> The proposed table returns one row per retained metadata version with:
> `timestamp`, `file`, `latest_snapshot_id` and `properties`.
>
> *Example use cases*:
>
> - Audit/RCA: check whether properties like `gc.enabled` or metadata
> cleanup settings were enabled before a maintenance operation.
> - Debugging regressions: correlate behavior changes with updates to
> properties like `write.update|delete|merge.mode`,
> `write.target-file-size-bytes` or `write.distribution-mode`.
>
>
> Note that the PR does NOT change the table spec or write path. It only
> exposes information that is already retained in metadata files, and makes
> it available through Spark/Flink metadata table syntax.
>
> *The primary questions* I’d like feedback on are below, but any other
> feedback or concerns are also welcome:
>
> 1. Is this metadata table useful enough to add?
> 2. Is `table_properties_log` the right user-facing name?
> 3. Is the proposed schema reasonable?
> 4. Is reading retained previous metadata files acceptable for this
> read-only metadata table?
>
> If this direction makes sense, I’d also appreciate review on the PR. If
> the community thinks this is too narrow or not worth adding, I’m happy to
> close it or rework the proposal.
>
> Best regards,
> Tom
>
