Ryan, thanks for the reply. I have created an issue (https://github.com/apache/incubator-iceberg/issues/181) and will try to come up with a PR.
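
To make the naming-pattern idea concrete, here is a minimal sketch in Python. It is an illustration only, not Iceberg code: the function name `metadata_files_to_expire`, the `keep_previous` parameter, and the assumption that the UUID part is a standard 36-character UUID are all hypothetical. It only selects file names; deleting the files is left to the caller.

```python
import re

# Assumed file-name pattern, based on the suggestion in this thread:
# v(num)-(uuid).metadata.json[.gz]
# The 36-character UUID form is an assumption made for this sketch.
METADATA_FILE = re.compile(
    r"v(?P<version>\d+)-(?P<uuid>[0-9a-fA-F-]{36})\.metadata\.json(?:\.gz)?"
)


def metadata_files_to_expire(file_names, current_version, keep_previous=2):
    """Return metadata file names older than the retained window, oldest first.

    The current version and the `keep_previous` versions before it are kept,
    so a rollback by changing the pointer stays possible. Names that do not
    match the pattern are never returned.
    """
    oldest_kept = current_version - keep_previous
    candidates = []
    for name in file_names:
        match = METADATA_FILE.fullmatch(name)
        if match is None:
            continue  # unrecognized file: leave it alone
        version = int(match.group("version"))
        if version < oldest_kept:
            candidates.append((version, name))
    return [name for _, name in sorted(candidates)]
```

Keeping a few previous versions by default follows the point about rollback in the quoted reply below; setting `keep_previous=0` would select every version except the current one.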

Kind regards,
Arina

> On May 6, 2019, at 9:14 PM, Ryan Blue <rb...@netflix.com.INVALID> wrote:
>
> Arina,
>
> So far, we’ve kept these around to help troubleshoot format problems. It has
> been a fairly cheap way to be able to see exactly what happened to the table.
> But we’re also getting to the point where we no longer need to refer back to
> them and should think about adding a way to remove them. Technically, you
> don’t need to keep them around once you’ve committed the new version, but an
> easy way to roll back is to change the database pointer so it is nice to keep
> a few of them.
>
> I think we can probably build a way to expire old metadata versions by
> looking for a naming pattern, like v(num)-(uuid).metadata.json[.gz]. Would
> you like to add an issue and maybe a PR for this?
>
> rb
>
> On Sat, May 4, 2019 at 7:43 AM Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
> Hi all,
>
> Iceberg table has expire snapshots notion, which helps to delete snapshots
> that are no longer needed along with data files, manifest and manifest lists:
>
>     // clean up the expired snapshots:
>     // 1. Get a list of the snapshots that were removed
>     // 2. Delete any data files that were deleted by those snapshots and
>     //    are not in the table
>     // 3. Delete any manifests that are no longer used by current
>     //    snapshots
>     // 4. Delete the manifest lists
>
> But we also have table metadata which is stored in JSON. New metadata version
> is created for each metadata change.
> I was assuming that with snapshot expiration operation, unneeded metadata
> files will also be deleted but they are not.
>
> My concern is that having JSON file for each metadata change with time may
> consume lots of space (setting `iceberg.compress.metadata` to true can help
> but not for long).
> Is there an option to expire table metadata versions as well?
>
> Kind regards,
> Arina
>
> --
> Ryan Blue
> Software Engineer
> Netflix