Re: Dedicated sync for Iceberg constraint support

huaxin gao Sat, 06 Jun 2026 19:37:46 -0700

Hi all,

Thanks everyone for attending the first constraint support sync. The
recording and Gemini notes are here:
<https://docs.google.com/document/d/1re65fx3uqC7I_tJuS79IxLiB7HEN2Grt5qRIDjd3p-4/edit?tab=t.88xicz95ytym#heading=h.rqg3vluul6bb>


A quick summary of what we aligned on:

   - Scope: limit to single-table constraints — CHECK, UNIQUE, and PRIMARY
   KEY. FOREIGN KEY is out of scope (multi-table / catalog-level concern).
   - Remove rely from core metadata: it's an engine-level decision, not
   table state. It will be handled as a Spark-specific option instead.
   - validated vs. enforced: validated reflects the state of the whole
   table (the constraint holds as of a snapshot, via a full scan or
   incremental enforcement on top of a validated state); enforced means a
   writer verified its own write. We'll add these definitions to the doc.
   - CHECK constraints stay single-row (no cross-row dependencies) so they
   remain cheap to enforce.
   - A new "Behavior" section will be added to the doc to capture how
   constraints interact with tables and engines.
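
To make the validated vs. enforced distinction concrete, here is a rough
sketch in Python. It is only an illustration of the definitions above, not
the proposal's metadata layout or any Iceberg API: CheckConstraint,
validated_snapshot, enforce, validate_full_scan, and commit_enforced are all
hypothetical names, and the table is modeled as a plain list of rows.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


@dataclass
class CheckConstraint:
    # Hypothetical model; field names are assumptions for illustration only.
    name: str
    predicate: Callable[[Row], bool]  # single-row: sees exactly one row
    # Snapshot at which the whole table is known to satisfy the constraint.
    validated_snapshot: Optional[int] = None


def enforce(constraint: CheckConstraint, new_rows: List[Row]) -> None:
    """Enforced: the writer verifies only the rows in its own write."""
    for row in new_rows:
        if not constraint.predicate(row):
            raise ValueError(f"CHECK {constraint.name} violated by {row}")


def validate_full_scan(constraint: CheckConstraint, table_rows: List[Row],
                       snapshot_id: int) -> bool:
    """Validated via full scan: every row in the table is checked."""
    if all(constraint.predicate(row) for row in table_rows):
        constraint.validated_snapshot = snapshot_id
        return True
    return False


def commit_enforced(constraint: CheckConstraint, table_rows: List[Row],
                    new_rows: List[Row], parent_snapshot: int,
                    new_snapshot: int) -> None:
    """Enforced write; validation carries forward only from a validated parent."""
    enforce(constraint, new_rows)
    table_rows.extend(new_rows)
    if constraint.validated_snapshot == parent_snapshot:
        # Incremental enforcement on top of a validated state.
        constraint.validated_snapshot = new_snapshot
```

Because the predicate only ever receives one row, enforcement cost scales
with the size of the write rather than the size of the table. An enforced
write on a table that was never validated leaves validated_snapshot unset,
since older rows may still violate the constraint.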

Since the sync, I've already added a couple of new sections to the proposal
to address the follow-ups:

   - CHECK expression limits — confirmed CHECK is single-row and
   deterministic (no cross-row/cross-table dependencies), matching the SQL
   standard and every major engine.
   - Enforcement feasibility by engine — a per-engine summary (Spark DSv2
   enforces CHECK today; Trino enforces at the engine/SPI level and would need
   Iceberg-connector wiring; Flink has no CHECK path and supports only NOT
   ENFORCED by design). Engines that can't enforce an enforced constraint
   should fail fast rather than write unchecked data.
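
The fail-fast rule can be sketched roughly as below. This is a minimal
illustration under assumptions, not how any engine or connector is actually
wired: plan_write, UnsupportedConstraintError, the can_enforce_check flag,
and the dict shape used for constraints are all made up for the example.

```python
from typing import Dict, List


class UnsupportedConstraintError(Exception):
    """Raised when an engine cannot enforce a constraint marked as enforced."""


def plan_write(can_enforce_check: bool, constraints: List[Dict]) -> List[str]:
    """Return the names of CHECK constraints the writer must verify.

    Hypothetical planning step: refuse the write up front if the table has an
    enforced CHECK constraint that this engine has no way to evaluate.
    """
    to_check = []
    for constraint in constraints:
        if constraint["type"] != "CHECK" or not constraint["enforced"]:
            continue  # informational or non-CHECK: nothing to verify here
        if not can_enforce_check:
            raise UnsupportedConstraintError(
                f"cannot enforce CHECK {constraint['name']}; refusing to write"
            )
        to_check.append(constraint["name"])
    return to_check
```

An engine with no CHECK path would call this with can_enforce_check=False
and get an error before any data files are written, while a table whose
constraints are all not enforced would still be writable from that engine.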


A note on the meeting link: the Google Meet link was updated to enable the
recording and Gemini notes capability. That unfortunately makes the link in
my original email (below) obsolete — sorry for the confusion. In the
future, the dev calendar is always the source of truth.

Thanks,
Huaxin


On Wed, May 27, 2026 at 3:58 PM huaxin gao <[email protected]> wrote:

> Hi all,
>
> I’ve been working on a proposal to add constraint support in Iceberg. The
> proposal covers persisting constraints in Iceberg metadata, binding them by
> field IDs for schema-evolution safety, and supporting CHECK constraints,
> with PRIMARY KEY and UNIQUE as informational metadata.
>
> I’d like to set up a dedicated sync to walk through the proposal and
> gather feedback.
>
> Proposal:
> https://docs.google.com/document/d/1re65fx3uqC7I_tJuS79IxLiB7HEN2Grt5qRIDjd3p-4/edit?tab=t.0#heading=h.o38ny2ndrd79
>
> I’ve scheduled a meeting every other Thursday at 9:00 AM PST, starting
> June 4.
>
> Meeting link: https://meet.google.com/kty-fdxd-aex?authuser=0
>
> Thanks,
> Huaxin
>
