Hi everyone,

Thanks for raising this. I believe the Java implementation and the spec are
still in sync since for V1 we always write schema
<https://github.com/apache/iceberg/blob/8f6ebb5b36a0263edfcb04e0c104b26225f95b07/core/src/main/java/org/apache/iceberg/TableMetadataParser.java#L176-L182>
and partition-spec
<https://github.com/apache/iceberg/blob/8f6ebb5b36a0263edfcb04e0c104b26225f95b07/core/src/main/java/org/apache/iceberg/TableMetadataParser.java#L192-L196>.
According to the PR <https://github.com/apache/iceberg-rust/pull/1087>, it
looks like the problem lies in Glue. I believe making the fields in the
spec optional instead of required isn't the right approach, since this
leads us to a path where we would update the spec for an
implementation that is not adhering to it. Since the field is deprecated, I
think this is a special case and we could add a section to Implementation
Notes <https://iceberg.apache.org/spec/#appendix-f-implementation-notes> that
states that the schemas and current-schema-id take priority over schema.

Curious to hear what others think.

Kind regards,
Fokko

Op di 18 mrt 2025 om 04:20 schreef Kevin Liu <kevinjq...@apache.org>:

> Hi Renjie,
>
> Thanks for raising this! For context, here's the iceberg-rust PR
> <https://github.com/apache/iceberg-rust/pull/1087> where we found the
> inconsistency.
> I am also in favor of approach 2 for backwards compatibility reasons.
> Perhaps, we can include this behavior in "Appendix E: Format version
> changes" <https://iceberg.apache.org/spec/#version-2>, under the "Reading
> v1 metadata for v2" section.
>
> Best,
> Kevin Liu
>
>
> On Mon, Mar 17, 2025 at 7:17 PM Renjie Liu <liurenjie2...@gmail.com>
> wrote:
>
>> Hi:
>>
>> We found an inconsistency between java implementation and spec about
>> partition-spec and schema in v1 table.
>>
>> In spec, it says in v1 table partition-spec and schema are required but
>> deprecated: https://iceberg.apache.org/spec/#table-metadata
>>
>> While in java implementation, they are both optional. For schema, it
>> checks for `schemas` first, and looks up the current table schema by
>> `schema-id`.  Otherwise it looks for a `schema` field. Similar things
>> happen for `partition spec`.
>>
>> We found this problem when implementing iceberg-rust. To resolve this
>> inconsistency, there are two approaches:
>>
>> 1. Modify java implementation to match spec, e.g. force checking `schema`
>> and `partition spec` field.
>> 2. Update spec to claim that both `schema` and `partition spec` are
>> optional for v1 table.
>>
>> Personally I'm in favor of approach 2 as it keeps backward compatibility
>> and seems a more reasonable solution to me.
>>
>> Looking forward to hearing from you!
>>
>

Reply via email to