Ryan,

One use case is that a user might need to time travel to a certain snapshot.
However, that snapshot has already been expired, because the snapshot
expiration retains only the latest snapshot, and the only intent of that
expiration is to physically remove the GC'd partition. That seems like
overkill to me.
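
To make the trade-off concrete, here is a rough toy model in Python. It is
not the Iceberg API: the Snapshot class, the expire_retain_last helper, and
the file names are all made up for illustration, and it assumes a data file
can be physically deleted only once no retained snapshot references it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot (not an Iceberg class)."""
    snapshot_id: int
    operation: str         # "append" or "delete"
    data_files: frozenset  # data files reachable from this snapshot


def expire_retain_last(history, retain_last):
    """Keep the newest `retain_last` snapshots.

    Returns (kept_snapshots, files_safe_to_delete). A file is safe to
    delete only if no kept snapshot still references it.
    """
    if retain_last < 1:
        raise ValueError("retain_last must be at least 1")
    kept = history[-retain_last:]
    expired = history[:-retain_last]
    still_referenced = set().union(*(s.data_files for s in kept))
    expired_files = set().union(*(s.data_files for s in expired))
    return kept, expired_files - still_referenced


# Hypothetical history: three daily appends, then a delete of the oldest date.
A = "dt=2023-05-29/a.parquet"
B = "dt=2023-05-30/b.parquet"
C = "dt=2023-05-31/c.parquet"
HISTORY = [
    Snapshot(1, "append", frozenset({A})),
    Snapshot(2, "append", frozenset({A, B})),
    Snapshot(3, "append", frozenset({A, B, C})),
    Snapshot(4, "delete", frozenset({B, C})),  # delete from tbl where dt = 2023-05-29
]

# Keeping only the latest snapshot frees the deleted partition's file,
# but snapshots 1 to 3 are gone, so time travel to them is impossible.
kept_one, removable_one = expire_retain_last(HISTORY, 1)

# Keeping two snapshots preserves some history, but snapshot 3 still
# references the deleted partition's file, so nothing can be removed.
kept_two, removable_two = expire_retain_last(HISTORY, 2)
```

In this sketch, retaining one snapshot is the only setting that lets the
dt=2023-05-29 file be removed, and it costs every earlier snapshot.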

I hope my explanation makes sense to you.

On Thu, Jun 1, 2023 at 3:39 PM Ryan Blue <b...@tabular.io> wrote:

> Pucheng,
>
> What is the use case around keeping the snapshot longer? We don't often
> have people ask to keep snapshots that can't be read, so it sounds like you
> might have something specific in mind?
>
> Ryan
>
> On Wed, May 31, 2023 at 8:19 PM Pucheng Yang <py...@pinterest.com.invalid>
> wrote:
>
>> Hi community,
>>
>> In my organization, a large portion of the datasets are partitioned by
>> date, and we normally keep only the latest X date partitions for a given
>> dataset.
>>
>> One issue that always bothers me is that if I want to delete a partition
>> that should be garbage collected, I run the SQL query
>> "delete from tbl where dt = ..." and then expire snapshots, keeping only
>> the latest one, to make sure that the partition data is physically
>> removed. However, the downside of this approach is that the table's
>> snapshot history is completely lost.
>>
>> I wonder if anyone else in the community has the same pain point? How do
>> you solve this? I would love to understand if there is a solution to this;
>> otherwise we can brainstorm whether there is a way to solve it.
>>
>> Thanks!
>>
>> Pucheng
>>
>
>
> --
> Ryan Blue
> Tabular
>
