Hi Iceberg Community,

RemoveSnapshots recently gained the ability to expire unused partition
specs and schemas. This is controlled by a flag called
'cleanExpiredMetadata', which defaults to 'false'. Additionally, Spark
<https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java#L147>
and Flink
<https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/ExpireSnapshotsProcessor.java#L86>
don't currently offer a way to set this flag.

1) Default value of RemoveSnapshots.cleanExpiredMetadata
I'm wondering whether the community would like to default this flag to
true. The effect would be that each snapshot expiration also cleans up
the unused partition specs and schemas. This functionality is quite new,
so the community might want more confidence in it before turning it on
by default, but I think it's worth considering.
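
To make the effect of the flag concrete, here is a minimal sketch in plain
Java. It is a simplified, hypothetical model and not the RemoveSnapshots
implementation: the class and method names are made up for illustration,
and it assumes a schema counts as "unused" when it is neither the current
schema nor referenced by any snapshot that survives expiration. Partition
specs could be modeled the same way.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Simplified, hypothetical model of the flag's effect; not Iceberg code.
final class CleanExpiredMetadataSketch {

  // Returns the schema IDs left in table metadata after an expiration run.
  // Assumption: a schema is "unused" when it is neither the current schema
  // nor referenced by any snapshot retained after expiration.
  static Set<Integer> schemasAfterExpiry(
      Set<Integer> allSchemaIds,
      int currentSchemaId,
      List<Integer> schemaIdsOfRetainedSnapshots,
      boolean cleanExpiredMetadata) {
    if (!cleanExpiredMetadata) {
      // Flag off (today's default): expiration leaves the schema list untouched.
      return new TreeSet<>(allSchemaIds);
    }
    Set<Integer> stillUsed = new TreeSet<>();
    stillUsed.add(currentSchemaId);
    stillUsed.addAll(schemaIdsOfRetainedSnapshots);
    // Keep only IDs that actually exist in the table metadata.
    stillUsed.retainAll(allSchemaIds);
    return stillUsed;
  }
}
```

With schemas {0, 1, 2, 3}, current schema 3, and retained snapshots written
with schemas 1 and 3, the sketch keeps {1, 3} when the flag is true and all
four schemas when it is false.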

2) Spark and Flink to support setting this flag
I think it makes sense to add support in Spark's ExpireSnapshotsProcedure
<https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/ExpireSnapshotsProcedure.java#L116>
and ExpireSnapshotsSparkAction
<https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java#L147>
as well as in Flink's ExpireSnapshotsProcessor
<https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/operator/ExpireSnapshotsProcessor.java#L58>
and ExpireSnapshots
<https://github.com/apache/iceberg/blob/c02ebe4740b22d6f5a78b636aea2d918037b2751/flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/maintenance/api/ExpireSnapshots.java#L44>
so that this flag can be set based on user input.

WDYT?

Regards,
Gabor
