Based on [1], we never persisted the operation in the summary map. Instead,
we persisted it as a top-level field in Java, which is actually NOT what
the spec says. Does anyone remember cases when the operation was unknown? I
personally don't.

[1] -
https://github.com/apache/iceberg/blob/17f1c4d2205b59c2bd877d4d31bbbef9e90979c5/core/src/main/java/org/apache/iceberg/SnapshotParser.java#L63


ср, 16 жовт. 2024 р. о 12:42 Kevin Liu <kevin.jq....@gmail.com> пише:

> Hey folks,
>
> I’ve noticed a discrepancy between the Iceberg specification and the Java
> implementation regarding the `operation` key in the `Snapshot` `summary`
> field.
>
> The `Snapshot` object's `summary` dictionary includes a *required* key
> named `operation`, as outlined in the spec describing Table Metadata and
> Snapshots [1] and the generated OpenAPI YAML [2]. However, in the Java
> implementation [3], `operation` is treated as optional. In contrast, it
> remains a required field in the Python implementation [4].
> I also found that Java tests for `SnapshotParser` assert that the
> `operation` field is null. [5]
>
> Due to this discrepancy, a user reported [6] that the `metadata.json` file
> generated for an Iceberg table could not be read by PyIceberg, though it is
> readable using the Iceberg Java library.
>
> How should we proceed from here? Should the Java library enforce this
> requirement? Additionally, how should we handle existing `metadata.json`
> files that were generated without this field?
>
> Best,
> Kevin Liu
>
> [1] https://iceberg.apache.org/spec/#table-metadata-and-snapshots
> [2]
> https://github.com/apache/iceberg/blob/8e2eb9ac2e33ce4bac8956d4e2f099444d03c0e3/open-api/rest-catalog-open-api.yaml#L2057-L2060
> [3]
> https://github.com/apache/iceberg/blob/64b36999d7ff716ae2534fb0972fcc10d22a64c2/core/src/main/java/org/apache/iceberg/SnapshotParser.java#L124
> [4]
> https://github.com/apache/iceberg-python/blob/7cf0c225c3cdb32ac5e390de06b7b0e4fe7de92e/pyiceberg/table/snapshots.py#L182
> [5]
> https://github.com/apache/iceberg/blob/22a6b19c2e226eacc0aa78c1f2ffbdbb168b13be/core/src/test/java/org/apache/iceberg/TestSnapshotJson.java#L52
> [6] https://github.com/apache/iceberg-python/issues/1106
>
>

Reply via email to