Thanks, Fokko, for the historical context!
Quick check that we're aligned, since I think we may be closer than
it reads:
My PR leaves the result type table as `int` -- no change to the
transform table, no impact on hour/month/etc., no change to the
type model.
What the PR clarifies is the Avro encoding used when serializing a
`day` partition field into a manifest. Empirically today, Java,
PyIceberg, and Rust all write `{ "type": "int", "logicalType": "date" }`
there (TypeToSchema in Java, DayTransform.result_type in PyIceberg,
Transform::Day.result_type in Rust all produce a Date). Only
iceberg-go produces plain Avro `int`. The PR codifies the de facto
writer behavior as SHOULD and makes reader tolerance MUST.
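To make the SHOULD/MUST split concrete, here is a minimal Python sketch.
It is not code from any of the four implementations: the function names
are hypothetical, and I'm assuming the partition field type may arrive
wrapped in a `["null", T]` union. The reader check accepts both forms;
the writer helper returns the annotated form.

```python
def unwrap_optional(avro_type):
    """Reduce a ["null", T] union to T; any other type passes through."""
    if isinstance(avro_type, list):
        non_null = [t for t in avro_type if t != "null"]
        if len(non_null) == 1:
            return non_null[0]
    return avro_type


def is_day_partition_type(avro_type):
    """Reader side: accept plain `int` and `int` annotated as `date`."""
    t = unwrap_optional(avro_type)
    if t == "int":
        return True
    if isinstance(t, dict) and t.get("type") == "int":
        # No annotation, or the `date` annotation; both are int on the wire.
        return t.get("logicalType") in (None, "date")
    return False


def day_writer_type():
    """Writer side: the annotated form the PR recommends (SHOULD)."""
    return {"type": "int", "logicalType": "date"}
```

In this sketch, any other logical type on `int` (for example
`time-millis`) is rejected on purpose, so the tolerance stays limited to
the two forms under discussion.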
If your "stick with int" also covers the Avro annotation, then we'd
effectively be reverting three writers and orphaning every existing
manifest. I don't think that's a decent path: it's a big change for a
small benefit.
Either way, I'm happy to adjust the spec wording. The goal is to stop
re-litigating this issue every year because this part of the spec
keeps being misread.
Best,
Andrei
On Wed, May 20, 2026 at 6:37 PM Fokko Driesprong <[email protected]> wrote:
> Thanks for bringing this up Kevin, a gift that keeps on giving :)
> https://github.com/apache/iceberg/issues/10616#issuecomment-2200191427
>
> 1. I think we should stick with the int type as defined in the spec.
> 2. It feels to me that some readers are more permissive here than others.
> I believe some allow reading date as an int without throwing. Practically,
> readers should read both.
> 3. Unfortunately, I think this is water under the bridge. As shown above in
> the GitHub Issue, we went back and forth, so I don't see a lot of value in
> switching this to date. All OSS implementations handle this as an int
> internally, and this also aligns with hour/month/etc.
>
> Hope this historical context helps.
>
> Kind regards,
> Fokko
>
>
> On 2026/05/20 16:33:51 Andrei Tserakhau via dev wrote:
> > Here is a fast follow with a PR:
> > https://github.com/apache/iceberg/pull/16446
> >
> > Best,
> > Andrei
> >
> > On Wed, May 20, 2026 at 6:11 PM Andrei Tserakhau <
> > [email protected]> wrote:
> >
> > > Thanks for raising this, Kevin.
> > >
> > > Speaking as an iceberg-go maintainer, even though Go is the
> > > implementation that has to move, I'd vote:
> > >
> > > 1. Writers SHOULD emit { "type": "int", "logicalType": "date" }.
> > > 2. Readers MUST accept both plain `int` and `int` annotated with
> > > `logicalType: date`.
> > > 3. Keep the transform result type table as-is (`int` as the logical
> > > Iceberg type). Don't change it to `date`. Add a separate, normative
> > > manifest-encoding clause so projection and expression-evaluation
> > > semantics that depend on the type model stay untouched.
> > >
> > > Reasoning: when Java, PyIceberg, and Rust all write logical `date`,
> > > that's the de facto wire format. Forcing them to switch to plain `int`
> > > to match a literal reading of the transform table would churn three
> > > implementations and leave every existing manifest "non-conforming"
> > > forever. Aligning Go with the dominant writer convention costs one
> > > implementation change (PR #915 already proposes it) and zero historical
> > > churn.
> > >
> > > The underlying ambiguity is that "result type" (logical Iceberg type)
> > > and "Avro manifest encoding" (wire format) were conflated. Separating
> > > them in spec text removes the ambiguity without changing the type
> > > system.
> > >
> > > Happy to drive the spec PR and then iceberg-go writer + reader
> > > alignment.
> > >
> > > Best,
> > > Andrei
> > >
> > > On Tue, May 19, 2026 at 5:45 PM Kevin Liu <[email protected]> wrote:
> > >
> > >> Hi all,
> > >>
> > >> I'd like to invite the community to discuss a spec ambiguity in Apache
> > >> Iceberg that has caused some confusion across implementations. We've seen
> > >> this come up in Python, Rust, and now Go.
> > >>
> > >> The issue: the spec documents the `day` partition transform's result type
> > >> as plain `int`, but Java, PyIceberg, and Rust all write manifest partition
> > >> fields using Avro's logical `date` type. Go currently writes plain `int`,
> > >> which is the strict reading of the spec. Since both forms have the same
> > >> physical representation, the difference is only the Avro schema annotation
> > >> -- but it's worth clarifying the spec so all implementations are aligned.
> > >>
> > >> The full analysis, including a breakdown of each implementation's
> > >> writer/reader behavior and proposed resolution options, is here:
> > >> https://github.com/apache/iceberg/issues/16414
> > >>
> > >> At a high level, the questions for the community are:
> > >> 1. What should implementations write: Avro `int` (plain integer) or Avro
> > >> `date` (integer with a date logical type)?
> > >> 2. Should implementations be required to read both forms, or just
> > >> encouraged to?
> > >> 3. Should the spec's transform result type table be updated from `int` to
> > >> `date`?
> > >>
> > >> I'd love to hear your thoughts. Thanks!
> > >>
> > >> Best,
> > >> Kevin Liu
> > >>
> > >
> >
>