Re: RFC: Control flink upsert sink’s memory usage of insertedRowMap

2023-12-08 Thread Ryan Blue
Thanks, Renjie! The option to use Flink's state tracking system seems like a good idea to me. On Thu, Dec 7, 2023 at 8:19 PM Renjie Liu wrote: > Hi: > I want to raise a discussion about controlling flink's upsert sink's > memory usage: > > https://toys-flash-4hl.craft.me/3VHrdWbV30QMk6 > > Welc

Proposal for REST APIs for Iceberg table scans

2023-12-08 Thread Chertara, Rahil
Hi all, My name is Rahil Chertara, and I’m a part of the Iceberg team at Amazon EMR and Athena. I’m reaching out to share a proposal for a new Scan API that will be utilized by the RESTCatalog. The process for table scan planning is currently done within client engines such as Apache Spark. By

Re: Community Meeting Minutes ?

2023-12-08 Thread Wing Yew Poon
Brian, Thanks for sending out the meeting minutes (the updated version looks good!). - Wing Yew On Thu, Dec 7, 2023 at 2:07 PM Brian Olsen wrote: > Hey Wing Yew, > > Sorry about this. I am just about to publish the last two. Me and the > other person that is responsible for these were hit by a

Isolation Analysis for Iceberg Multi-Table Transaction

2023-12-08 Thread Jack Ye
Hi everyone, I remember a while ago we had some discussions regarding the multi-table transaction API introduced in the REST spec at https://github.com/apache/iceberg/pull/6948#discussion_r1244026460. I recently did a more in-depth analysis, which can be viewed at: https://docs.google.com/documen

Proposal for RESTful Data operations

2023-12-08 Thread Drew
Hi everyone, My name is Drew Gallardo, and I’m a part of the Iceberg team at Amazon EMR and Athena. I’m reaching out to share a proposal that introduces data commits as a part of the RESTCatalog. The current process for data commits lives on the client side, and by shifting this logic into the RES

Apologies for multiple emails

2023-12-08 Thread Drew
Sorry for all the emails! I had an issue with sending the email out the other day with my proposal and it looks like the failed attempts ended up going through. Thank you, Drew

Re: Apologies for multiple emails

2023-12-08 Thread Pucheng Yang
You can reply to one of the emails to redirect people to the other to avoid discussion in two places. On Fri, Dec 8, 2023 at 2:53 PM Drew wrote: > Sorry for all the emails! I had an issue with sending the email out the > other day with my proposal and it looks like the failed attempts ended up >

RE: Proposal for RESTful Data Operations

2023-12-08 Thread Gallardo, Drew
Regarding the multiple emails sent earlier, please use this one for discussions. Thank you! On 2023/12/07 00:47:42 Drew wrote: > Hi everyone, > > My name is Drew Gallardo, and I’m a part of the Iceberg team at Amazon EMR > and Athena. I’m reaching out to share a proposal that introduces d

Re: Proposal for RESTful Data Operations

2023-12-08 Thread Ryan Blue
Thanks, Drew. I think it's a good idea in general to be able to perform commits on the server-side, but I would much rather break this down into smaller parts. I would definitely want to start with just file append use cases, since I think that is the biggest win. It can reduce retries and is an e