Hi all,

I’d like to propose adding *native incremental replication* to Iceberg
tables.

*Motivation:* Many production deployments require cross–data center backup
and data locality. Today this is usually handled by external services,
which adds operational overhead and introduces failure modes outside
Iceberg’s transactional boundary. Integrating replication into the commit
workflow would simplify operations and improve consistency.

*Proposal:* An optional replication phase in the commit process would
automatically copy data files and metadata to one or more targets (e.g.,
S3, HDFS, GCS, Azure). Replication is configured via table properties and
supports both synchronous (immediate consistency, higher latency) and
asynchronous (background retries, eventual consistency) modes. This
provides built-in disaster recovery, data locality optimization, and
cross-region analytics without external tool

Full draft proposal with design details is here:
👉 Incremental Iceberg Replication Proposal
<https://docs.google.com/document/d/1yrVLs0CQyIHs9WbBVx_EK6ad419Adsl9xHozpmQEMrs/edit?tab=t.0#heading=h.aa5ph23raz9l>

Thanks,
Xinli

Reply via email to