Thanks for bringing this up, Kevin, a gift that keeps on giving :) https://github.com/apache/iceberg/issues/10616#issuecomment-2200191427
1. I think we should stick with the int type as defined in the spec.
2. It feels to me that some readers are more permissive here than others. I believe some allow reading date as an int without throwing. Practically, readers should read both.
3. Unfortunately, I think this is water under the bridge. As shown above in the GitHub issue, we went back and forth, so I don't see a lot of value in switching this to date. All OSS implementations handle this as an int internally, and this also aligns with hour/month/etc.

Hope this historical context helps.

Kind regards,
Fokko

On 2026/05/20 16:33:51 Andrei Tserakhau via dev wrote:
> Here is a fast follow with a PR:
> https://github.com/apache/iceberg/pull/16446
>
> Best,
> Andrei
>
> On Wed, May 20, 2026 at 6:11 PM Andrei Tserakhau <
> [email protected]> wrote:
>
> > Thanks for raising this, Kevin.
> >
> > Speaking as an iceberg-go maintainer, even though Go is the
> > implementation that has to move, I'd vote:
> >
> > 1. Writers SHOULD emit { "type": "int", "logicalType": "date" }.
> > 2. Readers MUST accept both plain `int` and `int` annotated with
> > `logicalType: date`.
> > 3. Keep the transform result type table as-is (`int` as the logical
> > Iceberg type). Don't change it to `date`. Add a separate, normative
> > manifest-encoding clause so projection and expression-evaluation
> > semantics that depend on the type model stay untouched.
> >
> > Reasoning: when Java, PyIceberg, and Rust all write logical `date`,
> > that's the de facto wire format. Forcing them to switch to plain `int`
> > to match a literal reading of the transform table would churn three
> > implementations and leave every existing manifest "non-conforming"
> > forever. Aligning Go with the dominant writer convention costs one
> > implementation change (PR #915 already proposes it) and zero historical
> > churn.
> >
> > The underlying ambiguity is that "result type" (logical Iceberg type)
> > and "Avro manifest encoding" (wire format) were conflated. Separating
> > them in spec text removes the ambiguity without changing the type
> > system.
> >
> > Happy to drive the spec PR and then iceberg-go writer + reader
> > alignment.
> >
> > Best,
> > Andrei
> >
> > On Tue, May 19, 2026 at 5:45 PM Kevin Liu <[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> I'd like to invite the community to discuss a spec ambiguity in Apache
> >> Iceberg that has caused some confusion across implementations. We've seen
> >> this come up in Python, Rust, and now Go.
> >>
> >> The issue: the spec documents the `day` partition transform's result type
> >> as plain `int`, but Java, PyIceberg, and Rust all write manifest partition
> >> fields using Avro's logical `date` type. Go currently writes plain `int`,
> >> which is the strict reading of the spec. Since both forms have the same
> >> physical representation, the difference is only the Avro schema annotation
> >> -- but it's worth clarifying the spec so all implementations are aligned.
> >>
> >> The full analysis, including a breakdown of each implementation's
> >> writer/reader behavior and proposed resolution options, is here:
> >> https://github.com/apache/iceberg/issues/16414
> >>
> >> At a high level, the questions for the community are:
> >> 1. What should implementations write: Avro `int` (plain integer) or Avro
> >> `date` (integer with a date logical type)?
> >> 2. Should implementations be required to read both forms, or just
> >> encouraged to?
> >> 3. Should the spec's transform result type table be updated from `int` to
> >> `date`?
> >>
> >> I'd love to hear your thoughts. Thanks!
> >>
> >> Best,
> >> Kevin Liu
> >>
> >
>
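
For anyone following along, the "readers should accept both forms" point from the thread can be sketched roughly as below. This is only an illustration in Python and is not taken from any Iceberg implementation: the names `is_day_partition_schema` and `days_to_date` are hypothetical, and the sketch assumes the partition field's Avro type has already been parsed into either a string or a dict. It relies on the fact that plain `int` and `int` annotated with `logicalType: date` share the same physical value, a count of days since 1970-01-01.

```python
from datetime import date, timedelta

# Both encodings carry the same physical value: days since the Unix epoch.
EPOCH = date(1970, 1, 1)


def is_day_partition_schema(avro_type):
    """Hypothetical permissive check: accept plain int or int + logicalType date."""
    # Plain Avro primitive, written as the bare string "int".
    if avro_type == "int":
        return True
    # Object form: {"type": "int"} with either no annotation or the date annotation.
    if isinstance(avro_type, dict):
        return (
            avro_type.get("type") == "int"
            and avro_type.get("logicalType") in (None, "date")
        )
    return False


def days_to_date(days):
    """Interpret the stored int as a calendar date, whichever annotation was used."""
    return EPOCH + timedelta(days=days)


if __name__ == "__main__":
    for schema in ("int", {"type": "int", "logicalType": "date"}, "long"):
        print(schema, "->", is_day_partition_schema(schema))
    print(days_to_date(365))
```

A reader written this way would treat the annotation as a hint only, which is one possible reading of "Practically, readers should read both."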
