Hi everyone,

I’d like to propose an addition to the table specification to document
optional fields in the snapshot summary.

Currently, the snapshot summary includes a required operation field and
various optional fields. These optional fields, such as metrics and
partition-level summaries, are supported by the Java
<https://github.com/apache/iceberg/blob/549674b3fc0cdb18d6cad3e2d6320236fba8c562/core/src/main/java/org/apache/iceberg/SnapshotSummary.java#L32-L64>
and Python
<https://github.com/HonahX/iceberg-python/blob/45d611fe351f6f3847bf329aa053d890d810e2b6/pyiceberg/table/snapshots.py#L36-L60>
implementations, but they are not officially documented. This creates a
risk of inconsistency as other implementations and engines adopt and
interact with these fields.
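
As a rough illustration of the structure in question, a summary can be
sketched as a string-to-string map with one required key and several
optional ones. The key names follow those used in the Java implementation
linked above; the values are made up, and the helper split_summary and the
REQUIRED_FIELDS constant are hypothetical names used only for this sketch.

```python
# Illustrative snapshot summary; all values are invented for this sketch.
summary = {
    "operation": "append",      # the one required field
    "added-data-files": "2",    # optional metrics, stored as strings
    "added-records": "100",
    "total-data-files": "10",
    "total-records": "5000",
}

# Hypothetical constant: the fields every summary must carry.
REQUIRED_FIELDS = {"operation"}


def split_summary(summary):
    """Return the operation and a dict of the remaining optional fields."""
    missing = REQUIRED_FIELDS - summary.keys()
    if missing:
        raise ValueError(f"missing required summary fields: {sorted(missing)}")
    optional = {k: v for k, v in summary.items() if k not in REQUIRED_FIELDS}
    return summary["operation"], optional


operation, optional = split_summary(summary)
```

A reader that only knows the required field can still process this
summary, but it has no documented basis for interpreting the optional
keys, which is the gap this proposal aims to close.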

I propose adding a new section to the table specification to document these
optional fields, ensuring consistent naming conventions and reducing
ambiguity across implementations. While this is the primary proposal, it
may also be worth discussing whether documenting these fields separately in
Docs/Table would provide additional flexibility for future updates.

I’d love to hear your thoughts, suggestions, or concerns about this
proposal.

Looking forward to the discussion!

Links

   - GitHub tracking issue: https://github.com/apache/iceberg/issues/11659
   - Proposal: https://docs.google.com/document/d/1Gt1ZOXVXK60IGdlmt4QlyRzaZ1iCVyYUBfMJCsiz14I/edit?usp=sharing
   - PR: https://github.com/apache/iceberg/pull/11660


Best regards,
Honah
