I agree that this should have been a long in the spec, so +1 to fixing the
spec. I checked and Trino also implements this as a long.
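
For readers that may still run into metadata written with a string snapshot
id, a lenient parse is one way to stay compatible. The following is only a
minimal Python sketch, not code from PyIceberg or any other implementation:
the helper names parse_snapshot_id and read_statistics_entry are hypothetical,
and the sample id and path are made up for illustration. Only the "snapshot-id"
field name comes from the spec.

```python
import json


def parse_snapshot_id(raw):
    """Return the snapshot id as an int.

    Accepts a JSON number (the long representation) or a numeric string
    (the representation the old spec text described).
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool):
        raise ValueError(f"invalid snapshot-id: {raw!r}")
    if isinstance(raw, int):
        return raw
    if isinstance(raw, str):
        # int() raises ValueError for non-numeric strings.
        return int(raw)
    raise ValueError(f"invalid snapshot-id: {raw!r}")


def read_statistics_entry(text):
    """Parse one statistics entry (hypothetical helper) and normalize its id."""
    entry = json.loads(text)
    entry["snapshot-id"] = parse_snapshot_id(entry["snapshot-id"])
    return entry


# Illustrative inputs only: the same entry written with a long and a string id.
as_long = '{"snapshot-id": 3055729675574597004, "statistics-path": "s3://bucket/stats.puffin"}'
as_string = '{"snapshot-id": "3055729675574597004", "statistics-path": "s3://bucket/stats.puffin"}'

entry_from_long = read_statistics_entry(as_long)
entry_from_string = read_statistics_entry(as_string)
```

Both inputs normalize to the same integer id, so the rest of a reader would
not need to care which form was on disk.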

On Mon, Jul 28, 2025 at 12:39 PM Ajantha Bhat <ajanthab...@gmail.com> wrote:

> Hi everyone,
> One of the users has raised a PR to update the table statistics (puffin
> stats) spec.
> https://github.com/apache/iceberg/pull/13513
>
> I have suggested a mailing list voting thread and also tagged the original
> spec author.
> Since there was no response from them for a long time, I am taking it
> forward.
>
> The spec <https://iceberg.apache.org/spec/#table-statistics> defines the
> snapshot id as a string, whereas the Java
> <https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/StatisticsFile.java#L32>
> and Python
> <https://github.com/apache/iceberg-python/blob/479e6639103be367e218c16e83c22bc893400eb3/pyiceberg/table/statistics.py#L35>
> implementations use a long.
> IMO, we could update the implementations to use a string to match the spec
> and handle compatibility during read.
> But the spec is very old and definitely wrong (it doesn't align with the
> regular snapshot id representation).
> Hence, I think updating the spec is the right option here, as current
> implementations such as the Java and Python libraries use a long for the
> snapshot id.
>
> Please take a look at the PR and cast your vote.
>
> - Ajantha
>
>
