Hi Suraj,

I just answered on Slack, but I'll copy the replies here for everyone who's subscribed to the dev list:

1) Yes, there are use cases around this. To assist, we're planning on adding named snapshots so you don't need to keep the complete history. Instead, you would keep a selection of snapshots.

2) It is fine to keep snapshots for a long period of time. Part of their purpose is to allow you to time travel, and we've known for a long time about the use case of keeping a labelled version around (e.g. the one you trained a model with).

3) RewriteDataFiles will rewrite the files from one snapshot and produce another. If you're keeping old snapshots around, this wouldn't change them, although you could probably go back and rewrite those snapshots if you wanted to.

I hope that helps!

Ryan

On Sun, Jun 13, 2021 at 9:47 AM Suraj Chandran <chandransu...@gmail.com> wrote:

> Hi there,
>
> (Had asked on Slack, trying here as well)
>
> The documentation proposes "regularly expiring snapshots is recommended to
> delete data files that are no longer needed, and to keep the size of table
> metadata small".
> I had a few questions around that:
> 1) Are there people/usecases who are keeping snapshots for a long history
> of time, like for decades? This would help people manage/find "back dated
> corrections" in data.
> 2) Are snapshots even meant for keeping history for such long periods of
> time.
> 3) Would regular rewriteDataFiles help in such cases (by how much?)
>
> Thanks,
> Suraj
>

--
Ryan Blue
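
To make point 1 in the reply a bit more concrete, here is a minimal sketch of what a "keep a selection of snapshots" retention policy could look like. This is plain Python, not the Iceberg API: the `Snapshot` class, its `label` field, and `select_snapshots_to_keep` are hypothetical names made up for illustration, and the rule (keep every labelled snapshot plus anything newer than a cutoff) is only an assumption about how such a policy might work.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for a table snapshot; not an Iceberg class.
    snapshot_id: int
    timestamp_ms: int
    label: Optional[str] = None  # e.g. the version a model was trained with


def select_snapshots_to_keep(
    snapshots: List[Snapshot], expire_older_than_ms: int
) -> List[Snapshot]:
    """Keep every labelled snapshot, plus unlabelled ones at or after the cutoff."""
    return [
        s
        for s in snapshots
        if s.label is not None or s.timestamp_ms >= expire_older_than_ms
    ]


# Illustrative history: one labelled snapshot followed by three unlabelled ones.
history = [
    Snapshot(1, 1_000, label="model-v1-training-data"),
    Snapshot(2, 2_000),
    Snapshot(3, 3_000),
    Snapshot(4, 4_000),
]

kept = select_snapshots_to_keep(history, expire_older_than_ms=3_000)
kept_ids = [s.snapshot_id for s in kept]
```

With this rule the labelled snapshot survives no matter how old it is, while unlabelled snapshots older than the cutoff become candidates for expiration.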