I think we should abstract the API first, then implement MOR.
COW is also necessary, but it is easy to implement
and not so urgent.
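
As a rough illustration of what "abstract the API first" might look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the property name `write.row-level.mode`, the class and function names, and the in-memory lists standing in for data files are hypothetical and are not existing Iceberg APIs.

```python
from enum import Enum

# Hypothetical property name; not an existing Iceberg table property.
MODE_PROPERTY = "write.row-level.mode"


class RowLevelMode(Enum):
    COPY_ON_WRITE = "copy-on-write"
    MERGE_ON_READ = "merge-on-read"


def resolve_mode(table_properties, default=RowLevelMode.COPY_ON_WRITE):
    """Read the mode from the table properties, falling back to a default."""
    return RowLevelMode(table_properties.get(MODE_PROPERTY, default.value))


class RowLevelWriter:
    """Common API that both modes would implement (illustrative only)."""

    def delete_where(self, rows, predicate):
        raise NotImplementedError


class CopyOnWriteWriter(RowLevelWriter):
    def delete_where(self, rows, predicate):
        # COW: rewrite the data, dropping the matching rows.
        return {"data": [r for r in rows if not predicate(r)], "deletes": []}


class MergeOnReadWriter(RowLevelWriter):
    def delete_where(self, rows, predicate):
        # MOR: keep the data as-is and record deletes to apply at read time.
        return {"data": list(rows), "deletes": [r for r in rows if predicate(r)]}


def writer_for(table_properties):
    """Pick the implementation branch from the table property."""
    mode = resolve_mode(table_properties)
    if mode is RowLevelMode.COPY_ON_WRITE:
        return CopyOnWriteWriter()
    return MergeOnReadWriter()
```

With something like this, callers would only depend on `RowLevelWriter`, and a COW implementation could be contributed before the MOR one is ready.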

On Tue, Mar 3, 2020 at 3:45 PM Junjie Chen <chenjunjied...@gmail.com> wrote:

> Thanks, Ryan
>
> Maybe my earlier message was not very clear. Actually, we have built an
> internal implementation of update and delete in copy-on-write mode. Others
> may have their own internal implementations as well. What I propose
> is to provide a general framework or set of APIs that supports both copy on
> write and merge on read, so that people could share their COW implementations
> with the community and prepare some work for MOR as well. For example, we
> could define row-level update and MERGE INTO APIs, plus a table property that
> indicates the underlying mode; one could then share an implementation under
> the COW branch according to that table property.
>
> There may be other ways to build the general framework. I just want to
> know: do we want both COW and MOR implementations, or only MOR?
>
>
> On Tue, Mar 3, 2020 at 8:53 AM Ryan Blue <rb...@netflix.com.invalid>
> wrote:
>
>> It should be possible to build an implementation of MERGE INTO in Spark
>> now, using the validation that Anton added in #351
>> <https://github.com/apache/incubator-iceberg/pull/351>. I think he can
>> provide some more context.
>>
>> On Wed, Feb 26, 2020 at 7:42 AM Junjie Chen <chenjunjied...@gmail.com>
>> wrote:
>>
>>> Hi devs
>>>
>>> We are working on the row-level delete milestone for the upsert feature in
>>> merge-on-read mode. In the meantime, I think it may be useful to have a
>>> copy-on-write implementation. For example, we can implement upsert with
>>> Spark, so that we can finalize the common APIs that upsert may need and also
>>> discover some capabilities that Spark should provide. What do you think?
>>>
>>> --
>>> Best Regards
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
>
> --
> Best Regards
>
