Great job and nice document @OpenInx! Thanks for sharing the progress!

I also did a PoC a couple of weeks ago; you can take a look at the code here
<https://github.com/chenjunjiedada/incubator-iceberg/tree/row-level-delete>.
My approach uses additional meta columns (SRI) and is based on
the sequence number pull request #588
<https://github.com/apache/incubator-iceberg/pull/588>. The main
differences from yours include:

   - base file write path: It hooks the internal row to add metadata for
   the file name and row id.
   - delete file write path: It uses Spark to generate the deletion
   files via a staging table, and also sorts each deletion file by file name.
   - read path: Besides the sequence number, it uses the lower bound and upper
   bound to narrow down the deletion files.
   - base file + deletion file merge: It uses the filter API and also needs
   a merge-sort optimization.
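
To make the read path and the merge step above concrete, here is a minimal
sketch in Python. It is not the PoC code and not an Iceberg API: the names
DeleteFile, candidate_deletes and live_rows are hypothetical, the row id is
assumed to be the row's position in the base file, and the rule "a deletion
file applies only if its sequence number is greater than the base file's" is
an assumption made for illustration.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class DeleteFile:
    """Hypothetical deletion file: (file_name, row_id) pairs sorted by both."""
    seq: int                          # sequence number of the writing commit
    positions: List[Tuple[str, int]]  # sorted (file_name, row_id) entries

    @property
    def lower(self) -> str:
        # Lower bound on file name; valid because positions are sorted.
        return self.positions[0][0]

    @property
    def upper(self) -> str:
        # Upper bound on file name; valid because positions are sorted.
        return self.positions[-1][0]


def candidate_deletes(base_name: str, base_seq: int,
                      delete_files: Iterable[DeleteFile]) -> List[DeleteFile]:
    """Narrow down deletion files by sequence number, then by bounds."""
    return [
        d for d in delete_files
        if d.positions
        and d.seq > base_seq                   # assumed: only newer deletes apply
        and d.lower <= base_name <= d.upper    # bounds may contain the base file
    ]


def live_rows(base_name: str, rows: Iterable[str],
              deletes: Iterable[DeleteFile]) -> Iterator[str]:
    """Merge base rows with the sorted deleted row ids, skipping matches."""
    # Each per-file list is already sorted, so heapq.merge yields a sorted stream.
    deleted = heapq.merge(*(
        [row_id for name, row_id in d.positions if name == base_name]
        for d in deletes
    ))
    next_deleted = next(deleted, None)
    for row_id, row in enumerate(rows):
        # Advance past deleted ids that are behind the current row.
        while next_deleted is not None and next_deleted < row_id:
            next_deleted = next(deleted, None)
        if next_deleted == row_id:
            continue  # this row was deleted
        yield row
```

Because both sides are sorted, the merge makes a single pass over the base
rows and the deleted row ids instead of probing a set for every row.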

FYI, there is also an issue
<https://github.com/apache/incubator-iceberg/issues/825> about the
additional meta column. It seems that Spark will handle the additional
columns for Iceberg, so I didn't go further on that.

Besides the design doc, we still need to finalize more details for
merge-on-read, and I think that would be a good topic for the next sync-up
meeting.





On Sat, Mar 21, 2020 at 9:01 PM OpenInx <open...@gmail.com> wrote:

> Dear Iceberg Dev:
>
> As I said in the document[1] before,  we think the iceberg update/delete
> features (mainly merge-on-read) is the high
> priority feature (we've also discussed some flink+iceberg scenarios and
> anybody who interest that part can read
> the document).
>
> Recently, I write some demo to implement the merge-on-read thing( PoC).
> The pull request is here [2], I also provided
> a document to show the work [3].
>
> Any suggestion or feedback would be appreciated, Thanks.
>
> [1].
> https://docs.google.com/document/d/1I7FUPHyyvtZZ7zaTT1Lq14rNIEZFhzD41-fazVHEoIA/edit?usp=sharing
> [2]. https://github.com/openinx/incubator-iceberg/pull/5/files
> [3].
> https://docs.google.com/document/d/1CPFun2uG-eXdJggqKcPsTdNa2wPMpAdw8loeP-0fm_M/edit?usp=sharing
>
>

-- 
Best Regards
