+1 for using long type for snapshotId

On Mon, Jul 28, 2025 at 6:24 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
> +1 for long
>
> Given that it is implemented as a long in every known implementation, we
> might not even want to handle the type difference in code.
>
> Eduard Tudenhöfner <etudenhoef...@apache.org> wrote (on Mon, Jul 28,
> 2025, 12:47):
>
>> I agree that this should have been a long in the spec, so +1 to fixing
>> the spec. I checked and Trino also implements this as a long.
>>
>> On Mon, Jul 28, 2025 at 12:39 PM Ajantha Bhat <ajanthab...@gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>> One of the users has raised a PR to update the table statistics (Puffin
>>> stats) spec.
>>> https://github.com/apache/iceberg/pull/13513
>>>
>>> I have suggested a mailing list voting thread and also tagged the
>>> original spec author.
>>> Since there was no response from them for a long time, I am taking it
>>> forward.
>>>
>>> The spec <https://iceberg.apache.org/spec/#table-statistics> defines the
>>> snapshot id as a String, whereas the Java
>>> <https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/StatisticsFile.java#L32>
>>> and Python
>>> <https://github.com/apache/iceberg-python/blob/479e6639103be367e218c16e83c22bc893400eb3/pyiceberg/table/statistics.py#L35>
>>> implementations use Long.
>>> IMO, we could update the implementations to use a string to match the
>>> spec and handle compatibility during read.
>>> But the spec is very old and definitely wrong (it doesn't align with the
>>> regular snapshot id representation).
>>> Hence, I think updating the spec is the right option here, as current
>>> implementations like the Java and Python libraries use long for the
>>> snapshot id.
>>>
>>> Please take a look at the PR and cast your vote.
>>>
>>> - Ajantha
>>>
>>>
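
For reference, here is a rough sketch of what "handle compatibility during read" could look like if a reader ever had to accept both encodings. It is not taken from the Java or Python libraries: `read_snapshot_id` is a hypothetical helper, the sample entries are made up, and the only thing assumed from the spec is the `snapshot-id` field name.

```python
import json


def read_snapshot_id(stats_entry: dict) -> int:
    """Return the snapshot id as an int, whether the metadata stored it
    as a JSON number (what implementations write) or as a JSON string
    (what the current spec text describes)."""
    raw = stats_entry["snapshot-id"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool):
        raise TypeError("snapshot-id must not be a boolean")
    if isinstance(raw, int):
        return raw
    if isinstance(raw, str):
        return int(raw)  # raises ValueError on non-numeric strings
    raise TypeError(f"unsupported snapshot-id type: {type(raw).__name__}")


# Hypothetical statistics entries, one per encoding.
as_long = json.loads('{"snapshot-id": 3055729675574597004}')
as_string = json.loads('{"snapshot-id": "3055729675574597004"}')

print(read_snapshot_id(as_long) == read_snapshot_id(as_string))
```

If the spec is changed to long as proposed, only the number branch is needed, which matches the point above that the type difference may not need handling in code at all.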