Also, Iceberg catalogs support nested namespaces, so we may need to
consider a more general syntax than one limited to the database and
table levels.
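
To make the nested-namespace point concrete, here is a rough sketch, not
a proposal for the actual implementation. It assumes a table is
identified by a tuple of namespace levels plus a name, selects every
table under a namespace prefix of any depth, and expands the selection
into the existing per-table CALL statements. The names TableId,
tables_under and expire_calls are hypothetical, and how the table list
is obtained from a catalog is left out on purpose.

```python
from typing import Iterable, List, Tuple

# Hypothetical identifier shape: (namespace levels, table name).
TableId = Tuple[Tuple[str, ...], str]


def tables_under(tables: Iterable[TableId], prefix: Tuple[str, ...]) -> List[TableId]:
    """Keep tables whose namespace starts with `prefix`, at any depth.

    An empty prefix matches every table, i.e. the whole catalog.
    """
    return [t for t in tables if t[0][:len(prefix)] == prefix]


def expire_calls(catalog: str, tables: Iterable[TableId]) -> List[str]:
    """Build one existing-style per-table CALL statement per table."""
    calls = []
    for namespace, name in tables:
        # Join all namespace levels and the table name with dots.
        full_name = ".".join(namespace + (name,))
        calls.append(
            f"CALL {catalog}.system.expire_snapshots(table => '{full_name}')"
        )
    return calls


if __name__ == "__main__":
    # Made-up table list for illustration only.
    known = [(("db",), "sample"), (("db", "sub"), "t1"), (("other",), "t2")]
    for stmt in expire_calls("hive_prod", tables_under(known, ("db",))):
        print(stmt)
```

With a prefix like ("db",) this picks up both db.sample and db.sub.t1,
which a database-only parameter would not express.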

On Thu, Dec 7, 2023 at 5:17 AM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> I just think this is a bit more complicated than I want to take into the
> main library just because we have to make decisions about
>
> 1. Retries
> 2. Concurrency
> 3. Results/Error Reporting
>
> But if we have a good proposal for how we will handle all of those, I
> think we could do it?
>
> On Dec 6, 2023, at 2:05 PM, Andrea Campolonghi <acampolon...@gmail.com>
> wrote:
>
> I think that if you call an expire snapshots function, this is exactly
> what you want.
>
> On Wed, Dec 6, 2023 at 18:47 Ryan Blue <b...@tabular.io> wrote:
>
>> My concern with the per-catalog approach is that people might
>> accidentally run it. Do you think it's clear enough that these invocations
>> will drop older snapshots?
>>
>> On Wed, Dec 6, 2023 at 2:40 AM Andrea Campolonghi <acampolon...@gmail.com>
>> wrote:
>>
>>> I like this approach. +1
>>>
>>> On 6 Dec 2023, at 11:37, naveen <nk1...@gmail.com> wrote:
>>>
>>> Hi Everyone,
>>>
>>> Currently, the Spark procedures *expire_snapshots* and
>>> *remove_orphan_files* are supported per table only.
>>>
>>> Today, if someone has to run GCs on an entire catalog they will have to
>>> manually run these procedures for every table.
>>>
>>> Would it be a good idea to support running them in bulk, either per
>>> catalog or for multiple tables?
>>>
>>> Current syntax:
>>>
>>> CALL hive_prod.system.expire_snapshots(table => 'db.sample', <Options>)
>>>
>>> Proposed syntax, something like:
>>>
>>> Per Namespace/Database
>>>
>>> CALL hive_prod.system.expire_snapshots(database => 'db', <Options>)
>>>
>>> Per Catalog
>>>
>>> CALL hive_prod.system.expire_snapshots(<Options>)
>>>
>>> Multiple Tables
>>>
>>> CALL hive_prod.system.expire_snapshots(tables => Array('db1.table1',
>>> 'db2.table2'), <Options>)
>>>
>>> PS: There could be exceptions for individual catalogs. For example,
>>> Nessie doesn't support GC other than through the Nessie CLI, and
>>> Hadoop can't list all the namespaces.
>>>
>>>
>>> Regards,
>>> Naveen Kumar
>>>
>>>
>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>
>
