Re: upsert base on copy on write mode

Junjie Chen Mon, 02 Mar 2020 23:45:53 -0800

Thanks, Ryan

This may already have been settled in an earlier discussion. We have built
an internal implementation of update and delete in copy-on-write mode, and
others may have internal implementations of their own. What I propose is a
general framework or set of APIs that supports both copy on write (COW) and
merge on read (MOR). People could then contribute their COW implementations
to the community and do some groundwork for MOR as well. For example, we
could define row-level update and MERGE INTO APIs plus a table property that
indicates the underlying mode; a COW implementation could then be shared
under the COW branch selected by that table property.
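
To make the idea concrete, here is a rough sketch of what I mean by one
upsert API with a table property selecting the mode. It is a toy in-memory
model in Python, not Iceberg code: the property name "write.upsert.mode", the
Table class, and the upsert/scan helpers are all hypothetical names made up
for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical property name and values; not actual Iceberg table properties.
MODE_PROPERTY = "write.upsert.mode"
COPY_ON_WRITE = "copy-on-write"
MERGE_ON_READ = "merge-on-read"


@dataclass
class Table:
    """Toy table: a list of data files plus recorded row-level deletes."""
    properties: dict
    data_files: list = field(default_factory=list)  # each file is a list of row dicts
    deletes: list = field(default_factory=list)     # (file_count_at_commit, keys) pairs


def upsert(table, rows, key="id"):
    """Single entry point; the table property picks the branch."""
    mode = table.properties.get(MODE_PROPERTY, COPY_ON_WRITE)
    if mode == COPY_ON_WRITE:
        _upsert_cow(table, rows, key)
    elif mode == MERGE_ON_READ:
        _upsert_mor(table, rows, key)
    else:
        raise ValueError(f"unknown mode: {mode}")


def _upsert_cow(table, rows, key):
    """Rewrite every file that holds a matching key, then add the new rows."""
    keys = {r[key] for r in rows}
    rewritten = []
    for f in table.data_files:
        if any(r[key] in keys for r in f):
            kept = [r for r in f if r[key] not in keys]
            if kept:
                rewritten.append(kept)
        else:
            rewritten.append(f)  # untouched files are reused as they are
    rewritten.append(list(rows))
    table.data_files = rewritten


def _upsert_mor(table, rows, key):
    """Leave old files alone; record deletes and append the new rows."""
    keys = {r[key] for r in rows}
    # The deletes apply only to files that existed before this commit.
    table.deletes.append((len(table.data_files), keys))
    table.data_files.append(list(rows))


def scan(table, key="id"):
    """Read all live rows, applying recorded deletes at read time."""
    out = []
    for i, f in enumerate(table.data_files):
        dead = set()
        for file_count, keys in table.deletes:
            if i < file_count:
                dead |= keys
        out.extend(r for r in f if r[key] not in dead)
    return out
```

In this sketch both branches give the same scan result for the same upsert;
they differ only in whether the merge cost is paid at write time (COW) or at
read time (MOR), which is why a common API above them seems feasible.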


There may be other ways to build the general framework. I just want to
know: do we want both COW and MOR implementations, or only MOR?


On Tue, Mar 3, 2020 at 8:53 AM Ryan Blue <[email protected]> wrote:

> It should be possible to build an implementation of MERGE INTO in Spark
> now, using the validation that Anton added in #351
> <https://github.com/apache/incubator-iceberg/pull/351>. I think he can
> provide some more context.
>
> On Wed, Feb 26, 2020 at 7:42 AM Junjie Chen <[email protected]>
> wrote:
>
>> Hi devs
>>
>> We are working on the row-level delete milestone for the upsert feature in
>> merge-on-read mode. In the meantime, I think it may be useful to have a
>> copy-on-write implementation. For example, we can implement upsert with
>> Spark, so that we can finalize the common APIs that upsert may need and also
>> discover some capabilities that Spark should provide. What do you think?
>>
>> --
>> Best Regards
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


-- 
Best Regards
