Re: Proposal for RESTful Data Operations

2023-12-11 Thread Ryan Blue
> Based on my understanding of the proposal, I think it's more about the possibility of enabling other ways that do not require a full rollback. It's just that we currently implemented it as a rollback to prove the feasibility. My main question is this: what can be done besides rolling back a commit? A

Re: Proposal for RESTful Data Operations

2023-12-11 Thread Jack Ye
> The proposal is to roll back rewrite commits, but that's already possible with the much simpler API that exists today. Based on my understanding of the proposal, I think it's more about the possibility of enabling other ways that do not require a full rollback. It's just that we currently implemented

Re: Proposal for REST APIs for Iceberg table scans

2023-12-11 Thread Jack Ye
Hi Ryan, thanks for the feedback! I was a part of this design discussion internally and can provide more details. One reason for separating the CreateScan operation was to make the API asynchronous and thus keep HTTP communications short. Consider the case where we only have a GetScanTasks API, and

Re: RFC: Control flink upsert sink’s memory usage of insertedRowMap

2023-12-11 Thread Renjie Liu
Hi, OpenInx: Yes, I've read that PR. But if we want to maintain the out-of-memory map, I lean towards using Flink's operator state, which already provides a high-level API for us, including type serialization, configuration, etc. It would be easier to maintain by reusing what Flink has provi

Re: RFC: Control flink upsert sink’s memory usage of insertedRowMap

2023-12-11 Thread Renjie Liu
Hi, Ryan: Yes, but as I have stated in the doc, this may make simple ingestion jobs more difficult to maintain. What's your opinion about the second approach? On Mon, Dec 11, 2023 at 11:16 AM OpenInx wrote: > https://github.com/apache/iceberg/pull/2680/files > > On Mon, Dec 11, 2023 at 11:15 AM