Hello everyone,

I’d like to ask for feedback on whether adding a `table_properties_log`
metadata table is a direction worth pursuing.

PR: https://github.com/apache/iceberg/pull/16859

This PR adds a read-only metadata table that exposes the history of table
properties from retained Iceberg metadata files.
In the current version of Apache Iceberg, if users want to understand when
a table property changed, they need to follow the metadata log/previous
metadata files and inspect `metadata.json` files manually.
The PR exposes the table properties recorded in each retained metadata
version through the existing metadata table mechanism.

The proposed table returns one row per retained metadata version with:
`timestamp`, `file`, `latest_snapshot_id` and `properties`.
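
To make the row shape concrete, here is a minimal Python sketch of the manual
walk that such a table would replace. It is not the PR's implementation. It
assumes the metadata files are uncompressed JSON on a local filesystem, and the
mapping of the four columns to the spec fields `metadata-log`,
`last-updated-ms`, `current-snapshot-id` and `properties` is my assumption about
how the rows could be derived. The function name `properties_log` is
hypothetical.

```python
import json
from pathlib import Path


def properties_log(current_metadata_path):
    """Return one row per retained metadata version, oldest first.

    Hypothetical helper: assumes local, uncompressed metadata.json files.
    """
    current = json.loads(Path(current_metadata_path).read_text())

    # Previous versions are listed in `metadata-log`; the current file is not,
    # so it is appended using its own `last-updated-ms`.
    versions = [
        (entry["timestamp-ms"], entry["metadata-file"])
        for entry in current.get("metadata-log", [])
    ]
    versions.append((current["last-updated-ms"], str(current_metadata_path)))

    rows = []
    for timestamp_ms, file in versions:
        path = Path(file)
        if not path.exists():
            # A metadata file that was already cleaned up cannot be reported.
            continue
        metadata = json.loads(path.read_text())
        rows.append({
            "timestamp": timestamp_ms,
            "file": file,
            "latest_snapshot_id": metadata.get("current-snapshot-id"),
            "properties": metadata.get("properties", {}),
        })
    return rows
```

Comparing the `properties` value of consecutive rows then shows when a given
property changed, which is the lookup users currently have to do by hand.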

*Example use cases*:

   - Audit/RCA: check whether properties like `gc.enabled` or metadata
   cleanup settings were enabled before a maintenance operation.
   - Debugging regressions: correlate behavior changes with updates to
   properties like `write.update.mode`, `write.delete.mode`,
   `write.merge.mode`, `write.target-file-size-bytes` or
   `write.distribution-mode`.


Note that the PR does NOT change the table spec or write path. It only
exposes information that is already retained in metadata files, and makes
it available through Spark/Flink metadata table syntax.

*The primary questions* I’d like feedback on are below, but any other
feedback or concerns are also welcome:

1. Is this metadata table useful enough to add?
2. Is `table_properties_log` the right user-facing name?
3. Is the proposed schema reasonable?
4. Is reading retained previous metadata files acceptable for this
read-only metadata table?

If this direction makes sense, I’d also appreciate review on the PR. If the
community thinks this is too narrow or not worth adding, I’m happy to close
it or rework the proposal.

Best regards,
Tom
