Thanks everyone, especially Amogh and Fokko for providing other views on
this change. I think your viewpoints are definitely valid although I think
currently we should be working on moving things away from Snapshot Summary
if at all possible. I think it's better if the information there is never
critical for the correct functioning of an Iceberg implementation and I
think in the future it would probably be better if we got operation out of
there as well.

That said I think we have consensus and I'll merge this change after the
last review comments.

On Fri, Jan 17, 2025 at 4:22 AM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> +1 (non binding)
>
> Regards
> JB
>
> On Wed, Jan 15, 2025 at 5:59 PM Russell Spitzer
> <russell.spit...@gmail.com> wrote:
> >
> > Hi Everyone!
> >
> > PR: https://github.com/apache/iceberg/pull/11976/files
> >
> > Split out from #11948
> >
> > Working on the row-lineage implementation made it clear that we needed a
> way to get information from the Snapshot object propagated into the
> Metadata layer. Specifically we need to know the count of all newly added
> rows in that Snapshot to change the last-row-id of the TableMetadata. While
> we can potentially read this from Snapshot Summary, it would be a bit odd
> to have a requirement on key value pair within Snapshot summary. We could
> also potentially re-read and calculate the number of added rows, but this
> would require re-opening the manifest list itself.
> >
> > I believe it makes more sense to formally have added-rows as an optional
> field within the Snapshot itself so we can make it clear in the spec this
> value is expected to be stored within the Snapshot metadata if row-lineage
> is enabled.
> >
> >
> > Please take a look at the PR and signal vote or not you approve of
> adding this additional field to Snapshot
>

Reply via email to