Re: [DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread Naveen Kumar
> > My concern with the per-catalog approach is that people might accidentally > run it. Do you think it's clear enough that these invocations will drop > older snapshots? > As @Andrea has mentioned, the existing implementation of expire snapshots on every table should help. I just think this is a

Re: Why manifest rewrite only touches files that have latest spec id?

2023-12-06 Thread Pucheng Yang
As far as I understand, there is no particular reason; it seems like a feature that could be added. On Wed, Dec 6, 2023 at 8:06 PM Pucheng Yang wrote: > Hi community, > > May I know why manifest rewrite will only touch files that have the latest > spec id? What will be the suggestion if

Re: Community Meeting Minutes ?

2023-12-06 Thread Ajantha Bhat
+1, We need to improve on this. On Thu, Dec 7, 2023 at 2:36 AM Wing Yew Poon wrote: > The meeting minutes and a link to the recording used to be sent out to > this list regularly soon after the community sync. I have not been able to > attend the sync recently and I haven't seen the minutes for

Why manifest rewrite only touches files that have latest spec id?

2023-12-06 Thread Pucheng Yang
Hi community, May I know why manifest rewrite will only touch files that have the latest spec id? What would be the suggestion if we want to rewrite manifest files that belong to a non-current spec id? Manifest selection logic: https://github.com/apache/iceberg/blob/6a9d3c77977baff4295ee2dde0150d73c

Re: [DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread Renjie Liu
Also, the Iceberg catalog supports nested namespaces, so maybe we need to consider a more general syntax than just the database and table levels. On Thu, Dec 7, 2023 at 5:17 AM Russell Spitzer wrote: > I just think this is a bit more complicated than I want to take into the > main library just because we have t

Proposal for RESTful Data Operations

2023-12-06 Thread Drew
Hi everyone, My name is Drew Gallardo, and I’m a part of the Iceberg team at Amazon EMR and Athena. I’m reaching out to share a proposal that introduces data commits as a part of the RESTCatalog. The current process for data commits lives on the client side, and by shifting this logic into the RES

Re: [DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread Russell Spitzer
I just think this is a bit more complicated than I want to take into the main library just because we have to make decisions about 1. Retries 2. Concurrency 3. Results/Error Reporting. But if we have a good proposal for how we will handle all those, I think we could do it? > On Dec 6, 2023, at 2:05

Re: Community Meeting Minutes ?

2023-12-06 Thread Wing Yew Poon
The meeting minutes and a link to the recording used to be sent out to this list regularly soon after the community sync. I have not been able to attend the sync recently and I haven't seen the minutes for the last two syncs. Can we please maintain the practice of sending the minutes and recording

Re: [DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread Andrea Campolonghi
I think that if you call an expire snapshots function this is exactly what you want On Wed, Dec 6, 2023 at 18:47 Ryan Blue wrote: > My concern with the per-catalog approach is that people might accidentally > run it. Do you think it's clear enough that these invocations will drop > older snapsho

Re: Invitation to contribute to OneTable

2023-12-06 Thread Jack Ye
Sounds good! -Jack On Wed, Dec 6, 2023 at 10:16 AM Tim Brown wrote: > Hi Ryan, > > Apologies for the noise. > > Jack and Walaa, let's move any conversations to the Discussions > board on github for > the project. Also feel free to reach out

Re: Invitation to contribute to OneTable

2023-12-06 Thread Tim Brown
Hi Ryan, Apologies for the noise. Jack and Walaa, let's move any conversations to the Discussions board on github for the project. Also feel free to reach out to me directly if you prefer. I'm personally looking forward to learning more about

Re: Invitation to contribute to OneTable

2023-12-06 Thread Ryan Blue
I'm not sure that this is the right place for a discussion about the merits of their approach. This list is for Iceberg development. I encourage anyone interested to follow up on the appropriate incubator list rather than here. I also think it's debatable whether advertising other projects is hel

Re: [DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread Ryan Blue
My concern with the per-catalog approach is that people might accidentally run it. Do you think it's clear enough that these invocations will drop older snapshots? On Wed, Dec 6, 2023 at 2:40 AM Andrea Campolonghi wrote: > I like this approach. + 1 > > On 6 Dec 2023, at 11:37, naveen wrote: > >

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Ryan Blue
+1 On Wed, Dec 6, 2023 at 9:05 AM Russell Spitzer wrote: > Ah got it! For some reason I kept looking for a circle, but in the link > you sent I can see the obvious polygon that is missing. > > I'm +1 on switching the image to the one offered by Tabular > > On Dec 6, 2023, at 10:01 AM, Brian Olse

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Russell Spitzer
Ah got it! For some reason I kept looking for a circle, but in the link you sent I can see the obvious polygon that is missing. I'm +1 on switching the image to the one offered by Tabular > On Dec 6, 2023, at 10:01 AM, Brian Olsen wrote: > > Thanks Weston and Russell, > > To see the old file

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Weston Pace
Old: https://drive.google.com/file/d/1-XnKkXVQRufItgFhQYy9ULjjESPoneol/view?usp=drivesdk New: https://drive.google.com/file/d/1-gekPCy_c06dHVGUqNg1fq8VsS6Ycjos/view?usp=drivesdk On Wed, Dec 6, 2023, 8:01 AM Brian Olsen wrote: > Thanks Weston and Russell, > > To see the old file, look at the Wik

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Brian Olsen
Thanks Weston and Russell, To see the old file, look at the Wikimedia Commons image: https://en.wikipedia.org/wiki/Apache_Iceberg#/media/File:Apache_Iceberg_Logo.svg. You'll notice the transparent background reveals a triangular hole. You can also see this in the Apache store on RedBubble when lo

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Weston Pace
BTW: ASF mailing lists strip attachments and so you will need to use a gist or some other sharing. On Wed, Dec 6, 2023, 7:22 AM Russell Spitzer wrote: > The original email has a broken png link so I was never able to see the > issue, could you attach the before and after so I can see the differe

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Russell Spitzer
The original email has a broken png link so I was never able to see the issue, could you attach the before and after so I can see the difference? > On Dec 6, 2023, at 9:07 AM, Brian Olsen wrote: > > Hey all, > > I wanted to resurface this and see if any PMC could take a look. Thanks! > > On W

Re: Iceberg Logo Fix and Iceberg Swag Shop

2023-12-06 Thread Brian Olsen
Hey all, I wanted to resurface this and see if any PMC could take a look. Thanks! On Wed, Nov 1, 2023 at 8:37 AM Jean-Baptiste Onofré wrote: > Hi Brian, > > Good catch. > > We need to get approval from the PMC, and notify ASF VP Brand Management > (Mark Thomas) by sending a message to tradema..

Re: [DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread Andrea Campolonghi
I like this approach. + 1 > On 6 Dec 2023, at 11:37, naveen wrote: > > Hi Everyone, > > Currently Spark-Procedures supports expire_snapshots/remove_orphan_files per > table. > > Today, if someone has to run GCs on an entire catalog they will have to > manually run these procedures for every

[DISCUSS] Run GC with Catalog or Tables

2023-12-06 Thread naveen
Hi Everyone, Currently Spark-Procedures supports expire_snapshots/remove_orphan_files per table. Today, if someone has to run GCs on an entire catalog, they will have to manually run these procedures for every table. Is it a good idea to do it in bulk as per catalog or with multiple tables? C
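
For context, the per-table workflow described in this thread can be sketched as a small script that builds one CALL statement per table and procedure. This is a minimal sketch, not the proposal from the thread: the catalog name `my_catalog`, the table names, and the helper `build_gc_statements` are hypothetical, and the statements assume the Iceberg Spark procedures `expire_snapshots` and `remove_orphan_files` are invoked with the named `table` argument.

```python
from typing import Iterable, List

# The two procedures named in the thread; both are called with a `table` argument.
GC_PROCEDURES = ("expire_snapshots", "remove_orphan_files")


def build_gc_statements(catalog: str, tables: Iterable[str]) -> List[str]:
    """Return one Spark SQL CALL statement per (table, procedure) pair.

    `catalog` and `tables` are supplied by the caller; nothing is discovered
    from the catalog itself (hypothetical helper, for illustration only).
    """
    statements = []
    for table in tables:
        for procedure in GC_PROCEDURES:
            statements.append(
                f"CALL {catalog}.system.{procedure}(table => '{table}')"
            )
    return statements


if __name__ == "__main__":
    # Hypothetical catalog and table names.
    for stmt in build_gc_statements("my_catalog", ["db.orders", "db.events"]):
        print(stmt)
```

Each generated statement could then be passed to `spark.sql(...)` in a session configured with an Iceberg catalog. The sketch does not address the retries, concurrency, or error reporting questions raised in the replies, and running `expire_snapshots` this way drops older snapshots on every listed table.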