Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread Manu Zhang
Hi Ryan, > I think you could achieve what you're looking for by setting the age to 1 ms and the minimum number of snapshots to keep. I'm not sure how I can set the minimum number of snapshots to keep for tables with different update frequencies. For a daily updated table, I might set it to 5. How

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread Russell Spitzer
I do think this comes up a lot and is one of the more confusing things about snapshot expiration. Definitely one of my most answered questions is: "When I set min-snapshots to 1, why do I not get only 1 snapshot?" I agree adding another behavior may be even more confusing but I wouldn't be oppo

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread rdb...@gmail.com
I think you could achieve what you're looking for by setting the age to 1 ms and the minimum number of snapshots to keep. Then snapshot expiration would always expire all snapshots other than the min number, getting you what you want. It probably wouldn't make sense to set a maximum as well. Right
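
A minimal sketch of the retention rule this workaround relies on, assuming a simplified model of a single linear branch in which a snapshot is kept if it is among the newest `min_to_keep` snapshots or is younger than the maximum age. The function name, the sample timestamps, and the exact boundary comparison are illustrative assumptions, not Iceberg's implementation.

```python
def retained_snapshots(timestamps_ms, now_ms, max_age_ms, min_to_keep):
    """Return the snapshot timestamps kept after expiration, newest first.

    Simplified model (an assumption): keep a snapshot if it is among the
    newest `min_to_keep` snapshots OR it is younger than `max_age_ms`.
    """
    newest_first = sorted(timestamps_ms, reverse=True)
    kept = []
    for position, ts in enumerate(newest_first):
        within_min_count = position < min_to_keep
        young_enough = now_ms - ts < max_age_ms
        if within_min_count or young_enough:
            kept.append(ts)
    return kept


HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS
now = 10 * DAY_MS

# Hypothetical table: ten snapshots committed one hour apart, newest one hour ago.
hourly = [now - i * HOUR_MS for i in range(1, 11)]

# Age of 5 days, minimum of 1: every snapshot is younger than 5 days, so all stay.
kept_default_age = retained_snapshots(hourly, now, 5 * DAY_MS, 1)

# Age of 1 ms, minimum of 3: the age rule keeps nothing, so only the newest 3 stay.
kept_tiny_age = retained_snapshots(hourly, now, 1, 3)

print(len(kept_default_age), len(kept_tiny_age))
```

Under this model, shrinking the age to 1 ms makes the minimum count the only rule that retains anything, which is why it behaves like a fixed snapshot count.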

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-21 Thread Daniel Weeks
Hey Manu, I think I understand what you're trying to achieve here and I feel like the most important part is to have an updated version of the retention procedure to clearly state how this interacts with the other settings as part of the

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-17 Thread Manu Zhang
The intention is to set an upper limit for table size while keeping as many snapshots as possible. Setting max-snapshot-age-ms to a small value will lose history for some tables while setting min-snapshots-to-keep to a medium value will keep too much history for others. On Jan 18, 2025, Lewis, William

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-17 Thread Lewis, William
To clarify, the intention is that if max-snapshots-to-keep is set, then snapshots will be expired even if they are younger than max-snapshot-age-ms? Does this solve a problem that isn’t solved by setting max-snapshot-age-ms to a small value and setting min-snapshots-to-keep to your desired value
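
One possible reading of the proposed `max-snapshots-to-keep`, sketched under the assumption that the cap is applied after the existing age and minimum-count rules and that it expires snapshots even when they are younger than the maximum age. The thread does not settle these semantics, so the function, its parameters, and the sample tables below are hypothetical, and the case where the cap is smaller than the minimum is left unspecified.

```python
def retained_with_cap(timestamps_ms, now_ms, max_age_ms, min_to_keep, max_to_keep=None):
    """Return retained snapshot timestamps, newest first.

    Assumed semantics (not confirmed by the thread):
      1. keep the newest `min_to_keep` snapshots and any younger than `max_age_ms`;
      2. if `max_to_keep` is set, keep only the newest `max_to_keep` of those.
    Assumes max_to_keep >= min_to_keep when both are set.
    """
    newest_first = sorted(timestamps_ms, reverse=True)
    kept = [
        ts
        for position, ts in enumerate(newest_first)
        if position < min_to_keep or now_ms - ts < max_age_ms
    ]
    if max_to_keep is not None:
        kept = kept[:max_to_keep]
    return kept


HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS
now = 40 * DAY_MS

# Two hypothetical tables with 30 days of history each.
daily = [now - i * DAY_MS for i in range(1, 31)]
hourly = [now - i * HOUR_MS for i in range(1, 24 * 30 + 1)]

for name, table in [("daily", daily), ("hourly", hourly)]:
    by_age = retained_with_cap(table, now, 5 * DAY_MS, 1)
    capped = retained_with_cap(table, now, 5 * DAY_MS, 1, max_to_keep=20)
    print(name, len(by_age), len(capped))
```

With the same age and cap on both tables, the slowly updated table keeps everything the age rule allows, while the frequently updated table is bounded by the cap.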

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-16 Thread Yufei Gu
It makes sense to have an option to control the max number of snapshots. Thanks Manu for the proposal. Yufei On Thu, Jan 16, 2025 at 7:46 PM Manu Zhang wrote: > Hi all, > > Do you have more comments on this feature? Do you have concerns about > adding a new field to SnapshotRef? > > Thanks, >

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-16 Thread Manu Zhang
Hi all, Do you have more comments on this feature? Do you have concerns about adding a new field to SnapshotRef? Thanks, Manu On Tue, Jan 7, 2025 at 2:37 PM Manu Zhang wrote: > Hi Ajantha, > > `history.expire.min-snapshots-to-keep` is the *minimum number of > snapshots* we can keep. I'm propos

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-06 Thread Manu Zhang
Hi Ajantha, `history.expire.min-snapshots-to-keep` is the *minimum number of snapshots* we can keep. I'm proposing to decide the *maximum number of snapshots* to keep by count rather than by age. Thanks, Manu On Tue, Jan 7, 2025 at 2:18 PM Ajantha Bhat wrote: > Hi Manu, > > We already have `re

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-06 Thread Ajantha Bhat
Hi Manu, We already have `retain_last` and `history.expire.min-snapshots-to-keep` to retain the snapshots based on count. Can you please elaborate on why we can't use the same? - Ajantha On Tue, Jan 7, 2025 at 11:33 AM Walaa Eldin Moustafa wrote: > Thanks Manu for starting this discussion. Tha

Re: [DISCUSS] Support keeping at most N snapshots

2025-01-06 Thread Walaa Eldin Moustafa
Thanks Manu for starting this discussion. That is definitely a valid feature. I have always found maintaining snapshots by day makes it harder to provide different types of guarantees/contracts, especially when table change rates are diverse or irregular. Maintaining by snapshot count makes a lot o

[DISCUSS] Support keeping at most N snapshots

2025-01-06 Thread Manu Zhang
Hi all, While maintaining Iceberg tables for our customers, I find it difficult to set a default snapshot expiration time (`history.expire.max-snapshot-age-ms`) for different workloads. The default value of 5 days looks good for daily batch jobs but is too long for frequently updated jobs. I'm
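
A rough back-of-the-envelope calculation of why a single age-based default suits some workloads and not others, assuming a steady commit rate. The commit intervals and the helper name are illustrative assumptions; only the 5-day default comes from the message above.

```python
MINUTE_MS = 60 * 1000
DEFAULT_MAX_AGE_MS = 5 * 24 * 60 * MINUTE_MS  # 5 days, the default cited above


def snapshots_within_age(commit_interval_ms, max_age_ms=DEFAULT_MAX_AGE_MS):
    """Approximate count of snapshots younger than max_age_ms at a steady commit rate."""
    return max_age_ms // commit_interval_ms


# Hypothetical workloads with different commit intervals.
workloads = {
    "daily batch": 24 * 60 * MINUTE_MS,
    "hourly": 60 * MINUTE_MS,
    "every minute": MINUTE_MS,
}

for name, interval_ms in workloads.items():
    print(name, snapshots_within_age(interval_ms))
```

The same age setting leaves a handful of snapshots on a daily table and thousands on a table that commits every minute, which is the gap a count-based limit is meant to close.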