Thanks, Ryan.

Maybe my earlier message was not very clear. We have actually built an internal implementation of update and delete in copy-on-write mode, and others may have internal implementations of their own. What I propose is a general framework, or set of APIs, that supports both copy on write (COW) and merge on read (MOR). People could then share their COW implementations with the community and do some groundwork for MOR as well. For example, we could define row-level update and MERGE INTO APIs plus a table property that indicates the underlying mode; an implementation could then be shared under the COW branch, selected by that table property.
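As a rough illustration of the table-property idea, here is a minimal sketch in Python. It is a toy in-memory model, not Iceberg code: the property name `write.row-level.mode`, its two values, the `Table` class, and the choice of copy on write as the default are all assumptions made up for this example.

```python
from typing import Callable, Dict, List

# Hypothetical property name and values, invented for this sketch.
MODE_PROPERTY = "write.row-level.mode"
COPY_ON_WRITE = "copy-on-write"
MERGE_ON_READ = "merge-on-read"

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


class Table:
    """Toy table: a list of data files (lists of rows) plus recorded deletes."""

    def __init__(self, properties: Dict[str, str], files: List[List[Row]]):
        self.properties = dict(properties)
        self.data_files = [list(f) for f in files]
        self.deletes: List[Predicate] = []  # only used in merge-on-read mode


def delete_where(table: Table, predicate: Predicate) -> None:
    """Single row-level delete API; the table property picks the branch."""
    # Defaulting to copy on write is an assumption of this sketch.
    mode = table.properties.get(MODE_PROPERTY, COPY_ON_WRITE)
    if mode == COPY_ON_WRITE:
        _delete_copy_on_write(table, predicate)
    elif mode == MERGE_ON_READ:
        _delete_merge_on_read(table, predicate)
    else:
        raise ValueError(f"unknown mode: {mode}")


def _delete_copy_on_write(table: Table, predicate: Predicate) -> None:
    """Rewrite every file that contains a matching row; keep the others."""
    rewritten = []
    for rows in table.data_files:
        if any(predicate(r) for r in rows):
            kept = [r for r in rows if not predicate(r)]
            if kept:  # drop files that became empty
                rewritten.append(kept)
        else:
            rewritten.append(rows)
    table.data_files = rewritten


def _delete_merge_on_read(table: Table, predicate: Predicate) -> None:
    """Leave data files untouched; record the delete to apply at scan time."""
    table.deletes.append(predicate)


def scan(table: Table) -> List[Row]:
    """Read all rows, filtering out anything matched by a recorded delete."""
    return [
        r
        for rows in table.data_files
        for r in rows
        if not any(d(r) for d in table.deletes)
    ]
```

With this shape, callers use one API and see the same rows in either mode; only the write-side and read-side costs differ, which is why the two implementations could live side by side.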
There are probably other ways to build the general framework. I just want to know: do we want both COW and MOR implementations, or only MOR?

On Tue, Mar 3, 2020 at 8:53 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

> It should be possible to build an implementation of MERGE INTO in Spark
> now, using the validation that Anton added in #351
> <https://github.com/apache/incubator-iceberg/pull/351>. I think he can
> provide some more context.
>
> On Wed, Feb 26, 2020 at 7:42 AM Junjie Chen <chenjunjied...@gmail.com>
> wrote:
>
>> Hi devs
>>
>> We are working on row level delete milestone for upsert feature in merge
>> on read mode. In the meantime, I think it may be useful to have a copy on
>> write implementation. For example, we can implement upsert with spark, so
>> that we can finalize the common APIs that upsert may need and also we could
>> discover some capabilities that spark should provide. What do you think?
>>
>> --
>> Best Regards
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

--
Best Regards