Hi Filip,

CDC is one of the motivating use cases for row-level deletes. We're
building a CDC system and plan to use the update/delete/upsert features for
it.

In the meantime, we are using views and fact tables for the CDC use case.
Changes come in with an operation (delete, update, append) and a timestamp,
and a view groups them by ID and keeps the latest copy of each row by
timestamp. That works well for reasonably sized datasets. With Iceberg
tables, you can also compact older data with a job that reads from the view
as of some time and replaces the table's rows with the compacted view result.
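
To make the view and compaction idea concrete, here is a rough sketch in
plain Python rather than SQL or Iceberg APIs. The names (Change, latest_view,
compact) and the record layout are made up for illustration. It also assumes
that an ID whose latest change is a delete is left out of the view, and that
timestamps are unique per ID.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Change(NamedTuple):
    """One CDC record (hypothetical layout, for illustration only)."""
    id: int
    op: str                 # "append", "update", or "delete"
    ts: int                 # change timestamp
    data: Optional[dict]    # row payload; None for deletes


def _latest_per_id(changes: Iterable[Change]) -> Dict[int, Change]:
    """Group by ID and keep the change with the greatest timestamp."""
    latest: Dict[int, Change] = {}
    for change in changes:
        current = latest.get(change.id)
        if current is None or change.ts > current.ts:
            latest[change.id] = change
    return latest


def latest_view(changes: Iterable[Change]) -> Dict[int, dict]:
    """The "view": latest copy of each row, skipping deleted IDs (assumption)."""
    return {
        id_: change.data
        for id_, change in _latest_per_id(changes).items()
        if change.op != "delete"
    }


def compact(changes: Iterable[Change], as_of_ts: int) -> List[Change]:
    """Collapse changes with ts <= as_of_ts to one row per ID; keep newer ones."""
    changes = list(changes)
    old = [c for c in changes if c.ts <= as_of_ts]
    new = [c for c in changes if c.ts > as_of_ts]
    # Rows deleted before the cutoff are dropped; survivors become plain appends.
    kept = [
        c._replace(op="append")
        for c in _latest_per_id(old).values()
        if c.op != "delete"
    ]
    return kept + new
```

The point of the sketch is that compaction only rewrites the older part of
the change log, so the view over the compacted table returns the same rows as
the view over the full log.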

rb

On Tue, Mar 10, 2020 at 6:50 AM Filip <filip....@gmail.com> wrote:

> Curious if there's any write-up/ thoughts/ considerations around CDC
> use-cases and how Iceberg relates to the topic.
> Has anyone considered CDC-related use-cases and how well did Iceberg prove
> for that? If so, are there any thoughts you'd consider sharing from
> attempting to leverage Iceberg for such use-cases?
> Oh and I also was thinking that CDC could be a valuable candidate to
> consider for the Update/Delete/Upsert spec/ implementation.
>
>
> --
> Filip Bocse
>


-- 
Ryan Blue
Software Engineer
Netflix
