Thanks for bringing this up, Kevin. A gift that keeps on giving :)
https://github.com/apache/iceberg/issues/10616#issuecomment-2200191427

1. I think we should stick with the int type as defined in the spec.
2. It feels to me that some readers are more permissive here than others. I 
believe some allow reading date as an int without throwing. Practically, 
readers should read both.
3. Unfortunately, I think this is water under the bridge. As shown above in the
GitHub Issue, we went back and forth, so I don't see a lot of value in 
switching this to date. All OSS implementations handle this as an int 
internally, and this also aligns with hour/month/etc.
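
To make point 2 concrete, here is a minimal Python sketch of what a
permissive reader check could look like. The function names are hypothetical
and not taken from any Iceberg implementation. It assumes the Avro field type
has already been parsed into a string or dict, and it uses only the standard
library.

```python
from datetime import date, timedelta

# Day-transform values count days since the Unix epoch.
EPOCH = date(1970, 1, 1)


def accepts_day_partition_type(avro_type):
    """Return True for plain Avro int, or int annotated with logicalType date.

    Hypothetical helper: a real reader would work on its Avro library's
    schema objects rather than raw strings and dicts.
    """
    if avro_type == "int":
        return True
    if isinstance(avro_type, dict) and avro_type.get("type") == "int":
        # No annotation or the date annotation are both acceptable here.
        return avro_type.get("logicalType") in (None, "date")
    return False


def day_ordinal_to_date(days):
    """Convert a day-transform value (days since epoch) to a calendar date."""
    return EPOCH + timedelta(days=days)
```

Either schema form carries the same physical int, so the conversion to a
calendar date is the same regardless of which one the writer used.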

Hope this historical context helps.

Kind regards,
Fokko


On 2026/05/20 16:33:51 Andrei Tserakhau via dev wrote:
> Here is a fast follow with a PR:
> https://github.com/apache/iceberg/pull/16446
> 
> Best,
> Andrei
> 
> On Wed, May 20, 2026 at 6:11 PM Andrei Tserakhau <
> [email protected]> wrote:
> 
> > Thanks for raising this, Kevin.
> >
> > Speaking as an iceberg-go maintainer, even though Go is the
> > implementation that has to move, I'd vote:
> >
> > 1. Writers SHOULD emit { "type": "int", "logicalType": "date" }.
> > 2. Readers MUST accept both plain `int` and `int` annotated with
> >    `logicalType: date`.
> > 3. Keep the transform result type table as-is (`int` as the logical
> >    Iceberg type). Don't change it to `date`. Add a separate, normative
> >    manifest-encoding clause so projection and expression-evaluation
> >    semantics that depend on the type model stay untouched.
> >
> > Reasoning: when Java, PyIceberg, and Rust all write logical `date`,
> > that's the de facto wire format. Forcing them to switch to plain `int`
> > to match a literal reading of the transform table would churn three
> > implementations and leave every existing manifest "non-conforming"
> > forever. Aligning Go with the dominant writer convention costs one
> > implementation change (PR #915 already proposes it) and zero historical
> > churn.
> >
> > The underlying ambiguity is that "result type" (logical Iceberg type)
> > and "Avro manifest encoding" (wire format) were conflated. Separating
> > them in spec text removes the ambiguity without changing the type
> > system.
> >
> > Happy to drive the spec PR and then iceberg-go writer + reader
> > alignment.
> >
> > Best,
> > Andrei
> >
> > On Tue, May 19, 2026 at 5:45 PM Kevin Liu <[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> I'd like to invite the community to discuss a spec ambiguity in Apache
> >> Iceberg that has caused some confusion across implementations. We've seen
> >> this come up in Python, Rust, and now Go.
> >>
> >> The issue: the spec documents the `day` partition transform's result type
> >> as plain `int`, but Java, PyIceberg, and Rust all write manifest partition
> >> fields using Avro's logical `date` type. Go currently writes plain `int`,
> >> which is the strict reading of the spec. Since both forms have the same
> >> physical representation, the difference is only the Avro schema annotation
> >> -- but it's worth clarifying the spec so all implementations are aligned.
> >>
> >> The full analysis, including a breakdown of each implementation's
> >> writer/reader behavior and proposed resolution options, is here:
> >> https://github.com/apache/iceberg/issues/16414
> >>
> >> At a high level, the questions for the community are:
> >> 1. What should implementations write: Avro `int` (plain integer) or Avro
> >> `date` (integer with a date logical type)?
> >> 2. Should implementations be required to read both forms, or just
> >> encouraged to?
> >> 3. Should the spec's transform result type table be updated from `int` to
> >> `date`?
> >>
> >> I'd love to hear your thoughts. Thanks!
> >>
> >> Best,
> >> Kevin Liu
> >>
> >
> 
