Thank you for the clarification Daniel, and thank you Kevin for raising this issue!
Does that mean that we are creating component schemas that are the superset of the V1 and V2 schemas? And if so, should we remove summary and manifest-list from the required properties, and add manifests optional property to the Snapshot schema to support both V1 and V2 Summary specs? https://iceberg.apache.org/spec/#snapshots Or would creating separate component schemas for V1/V2 be a cleaner way to align the REST spec with the table spec? Sung On 2024/10/17 15:19:23 Daniel Weeks wrote: > I'm not convinced this is incorrect behavior (table spec or > implementation), but it does lend to some confusion. The 'summary' field > is optional, which means that if a summary is not provided, you do not have > an associated 'operation' field. The 'operation' field is only required in > the context of the summary, so it's actually possible for the > implementation (i.e. the tests you reference) to not have an operation. > > I think what is wrong here is that the REST spec marked the summary as > required > <https://github.com/apache/iceberg/blob/8e2eb9ac2e33ce4bac8956d4e2f099444d03c0e3/open-api/rest-catalog-open-api.yaml#L2040>, > which is inconsistent with the table spec. > > On Wed, Oct 16, 2024 at 3:52 PM Anton Okolnychyi <aokolnyc...@gmail.com> > wrote: > > > Based on [1], we never persisted the operation in the summary map. > > Instead, we persisted it as a top-level field in Java, which is actually > > NOT what the spec says. Does anyone remember cases when the operation was > > unknown? I personally don't. > > > > [1] - > > https://github.com/apache/iceberg/blob/17f1c4d2205b59c2bd877d4d31bbbef9e90979c5/core/src/main/java/org/apache/iceberg/SnapshotParser.java#L63 > > > > > > ср, 16 жовт. 2024 р. о 12:42 Kevin Liu <kevin.jq....@gmail.com> пише: > > > >> Hey folks, > >> > >> I’ve noticed a discrepancy between the Iceberg specification and the Java > >> implementation regarding the `operation` key in the `Snapshot` `summary` > >> field. > >> > >> The `Snapshot` object's `summary` dictionary includes a *required* key > >> named `operation`, as outlined in the spec describing Table Metadata and > >> Snapshots [1] and the generated OpenAPI YAML [2]. However, in the Java > >> implementation [3], `operation` is treated as optional. In contrast, it > >> remains a required field in the Python implementation [4]. > >> I also found that Java tests for `SnapshotParser` assert that the > >> `operation` field is null. [5] > >> > >> Due to this discrepancy, a user reported [6] that the `metadata.json` > >> file generated for an Iceberg table could not be read by PyIceberg, though > >> it is readable using the Iceberg Java library. > >> > >> How should we proceed from here? Should the Java library enforce this > >> requirement? Additionally, how should we handle existing `metadata.json` > >> files that were generated without this field? > >> > >> Best, > >> Kevin Liu > >> > >> [1] https://iceberg.apache.org/spec/#table-metadata-and-snapshots > >> [2] > >> https://github.com/apache/iceberg/blob/8e2eb9ac2e33ce4bac8956d4e2f099444d03c0e3/open-api/rest-catalog-open-api.yaml#L2057-L2060 > >> [3] > >> https://github.com/apache/iceberg/blob/64b36999d7ff716ae2534fb0972fcc10d22a64c2/core/src/main/java/org/apache/iceberg/SnapshotParser.java#L124 > >> [4] > >> https://github.com/apache/iceberg-python/blob/7cf0c225c3cdb32ac5e390de06b7b0e4fe7de92e/pyiceberg/table/snapshots.py#L182 > >> [5] > >> https://github.com/apache/iceberg/blob/22a6b19c2e226eacc0aa78c1f2ffbdbb168b13be/core/src/test/java/org/apache/iceberg/TestSnapshotJson.java#L52 > >> [6] https://github.com/apache/iceberg-python/issues/1106 > >> > >> >