Hi Barron, we've noticed the same issue since this PR, which introduced 
schema versions, was merged: https://github.com/apache/iceberg/pull/2096

There's a closed issue where folks were discussing options for remediating 
this problem, which also links to other related PRs and issues: 
https://github.com/apache/iceberg/issues/5219. Should we reopen that issue and 
consolidate our discussion points there?

Sung

From: barron....@twosigma.com At: 02/20/24 12:34:57 UTC-5:00
To: dev@iceberg.apache.org
Subject: Table Schema History Pruning

Hi folks,

I have a few questions regarding the schema history of an Iceberg table.

The table metadata file keeps track of every table schema version (at least in 
v2). Depending on the size of the schema, this history can become large in 
terms of byte size.


  1.  Is removing a schema from the history correct when the table does not 
have a snapshot referencing that schema version?
  2.  Does the Iceberg spec guarantee that the history is not pruned?
  3.  Are there any plans for Iceberg to support pruning the schema history?
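
To make question 1 concrete, here is a minimal Python sketch that lists the schema IDs in a table metadata document that are referenced by neither the current schema nor any snapshot. It assumes the metadata JSON has already been loaded into a dict and uses the `schemas`, `current-schema-id`, and `snapshots[].schema-id` fields from the table metadata format; the function name and the sample data are made up for illustration. It only identifies candidates and does not establish whether removing them is correct under the spec.

```python
def unreferenced_schema_ids(metadata: dict) -> list[int]:
    """Return schema IDs not used by the current schema or any snapshot."""
    all_ids = {schema["schema-id"] for schema in metadata.get("schemas", [])}

    # The current schema is always considered referenced.
    referenced = {metadata["current-schema-id"]}

    # A snapshot's schema-id is optional, so skip snapshots without one.
    for snapshot in metadata.get("snapshots", []):
        if "schema-id" in snapshot:
            referenced.add(snapshot["schema-id"])

    return sorted(all_ids - referenced)


# Hypothetical, heavily trimmed metadata used only for illustration.
example = {
    "current-schema-id": 2,
    "schemas": [{"schema-id": 0}, {"schema-id": 1}, {"schema-id": 2}],
    "snapshots": [
        {"snapshot-id": 100, "schema-id": 1},
        {"snapshot-id": 101, "schema-id": 2},
    ],
}

print(unreferenced_schema_ids(example))
```

In this sample, schema 0 is the only entry that no snapshot and no current-schema pointer refers to, so it would be the only pruning candidate.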

Thanks,
Barron Wei

