Hey Yuya,

Thanks for raising this. Where possible, I'd prefer not to add more flags,
since they tend to cause confusion. For example, in the PR the purge flag
is only taken into account when you remove the main ref. I'm leaning
towards keeping the snapshot-log rather than purging it. The snapshot-log
will then be cleaned up when the snapshots are expired, as stated in the spec
<https://iceberg.apache.org/spec/?column-projection#table-metadata-fields>.
This would also keep point-in-time reads
<https://iceberg.apache.org/spec/?column-projection#point-in-time-reads-time-travel>
working, assuming that the snapshots carry the schema-id.
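
To make the time-travel point concrete, here is a minimal Python sketch of
how a reader could resolve a timestamp against the snapshot-log. This is an
illustration only, not the Iceberg implementation: the function name
snapshot_id_as_of is made up, and I'm assuming the log is a list of entries
with "timestamp-ms" and "snapshot-id" keys, ordered by timestamp.

```python
from typing import Optional


def snapshot_id_as_of(snapshot_log: list, timestamp_ms: int) -> Optional[int]:
    """Return the snapshot that was current at timestamp_ms, or None.

    Hypothetical helper: assumes snapshot_log is ordered by ascending
    "timestamp-ms", one entry per change of the current snapshot.
    """
    current = None
    for entry in snapshot_log:
        if entry["timestamp-ms"] <= timestamp_ms:
            # This entry became current at or before the requested time.
            current = entry["snapshot-id"]
        else:
            # Later entries are all newer than the requested time.
            break
    return current
```

If the log is reset on replace, such a lookup returns nothing for any
timestamp before the replace, even though the older snapshots still exist.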

Kind regards,
Fokko

On Sun, 15 Dec 2024 at 13:20, Yuya Ebihara <
yuya.ebih...@starburstdata.com> wrote:

> The REST catalog resets the snapshot-log when replacing a table, as I
> reported in https://github.com/apache/iceberg/issues/11777.
>
> The cause is that `RESTSessionCatalog.Builder#replaceTransaction` calls
> `TableMetadata#buildReplacement`, which adds a `RemoveSnapshotRef` request,
> and the REST `CatalogHandlers.commit` then clears the snapshot-log while
> processing that request.
>
> I propose adding a new flag (e.g. boolean purge) to `RemoveSnapshotRef` so
> the catalog can skip clearing snapshot-logs.
> https://github.com/apache/iceberg/pull/11779 is the proposed change.
>
> BR,
> Yuya
>
