Hi all,
Sharing a summary of the Iceberg Read Restrictions sync on May 12, 2026, for
folks who couldn't attend. (As always, syncs are for discussion only.)
Recording: https://youtu.be/b9p6mI-k-0I
Topics discussed
1. NULL handling for mask-to-default
Question: should mask-to-default preserve NULL inputs (NULL → NULL) or replace
them with the type-specific default (NULL → 0 / "" / epoch / etc.)? The
direction in the room was to NOT preserve NULL - preserving leaks the
existence of NULL, which can itself be sensitive information. The other
actions (replace-with-null, mask-alphanum, show-first-4 / show-last-4,
sha-256 variants, truncate-to-year / truncate-to-month) keep their natural
NULL-preserving behavior; mask-to-default is the exception. This is
reflected in the most recent push to PR #13879.
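To make the NULL behavior concrete, here is a rough sketch (not the spec text or the PR's code). The function names, the type names, and the table of defaults are hypothetical and only follow the examples above (0 / "" / epoch); the use of "X" as the redaction character in show-last-4 is also an assumption. The point is only the contrast: mask-to-default ignores its input entirely, while a NULL-preserving action passes None through.

```python
from datetime import date, datetime, timezone
from typing import Any, Optional

# Hypothetical per-type defaults, for illustration only; the real values
# are whatever the spec PR ends up defining.
DEFAULTS = {
    "int": 0,
    "string": "",
    "date": date(1970, 1, 1),
    "timestamptz": datetime(1970, 1, 1, tzinfo=timezone.utc),
}


def mask_to_default(value: Any, type_name: str) -> Any:
    """Return the type default for every input, including None.

    The input is never inspected, so the output cannot reveal whether
    the underlying value was NULL.
    """
    return DEFAULTS[type_name]


def show_last_4(value: Optional[str]) -> Optional[str]:
    """NULL-preserving action: None in, None out.

    Assumes "X" as the redaction character. Strings of four characters
    or fewer are returned unchanged in this sketch.
    """
    if value is None:
        return None
    return "X" * max(len(value) - 4, 0) + value[-4:]
```

With this sketch, mask_to_default(None, "int") and mask_to_default(42, "int") both yield 0, whereas show_last_4(None) stays None.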
2. Older clients without read-restriction support
How should the spec handle clients that don't understand the
read-restrictions field returned by loadTable? Direction: introduce a generic
client-capability header (X-Iceberg-Client-Capabilities) as a forward-compat
signal, separate from per-request signals like X-Iceberg-Access-Delegation.
Trust establishment between client and catalog stays out of scope -
operator/catalog-implementation concern, not spec.
A follow-up PR (#16394) has been raised to add the header and the parameter
component; a separate [DISCUSS] thread is being raised in parallel.
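As a rough illustration of how a catalog might read such a header, here is a minimal sketch. Only the header name comes from the discussion; the comma-separated value format, the "read-restrictions" capability token, and the function names are assumptions made for this example, and PR #16394 is the place to check the actual definition. What the catalog does for a client that lacks the capability is deliberately left out, since that was not settled in the sync.

```python
from typing import Mapping, Set

HEADER = "X-Iceberg-Client-Capabilities"


def client_capabilities(headers: Mapping[str, str]) -> Set[str]:
    """Parse the capability header into a set of lowercase tokens.

    Assumes a comma-separated list of tokens (not confirmed by the spec).
    """
    # HTTP header names are case-insensitive, so compare lowercased names.
    for name, value in headers.items():
        if name.lower() == HEADER.lower():
            return {tok.strip().lower() for tok in value.split(",") if tok.strip()}
    # Header absent: no capabilities advertised (e.g. an older client).
    return set()


def supports(headers: Mapping[str, str], capability: str) -> bool:
    """True if the client advertised the given capability token."""
    return capability.lower() in client_capabilities(headers)
```

For example, supports({"X-Iceberg-Client-Capabilities": "read-restrictions"}, "read-restrictions") is True, and the same call with an empty header map is False.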
3. Identity propagation (Trino, Spark, etc.)
Surfaced briefly. Folks acknowledged identity propagation across multi-tenant
query engines is a real problem, but it's orthogonal to the spec - it's
a catalog / auth-manager concern. Not in scope for #13879.
4. Actions API placement
Continued discussion on where action definitions live in the Java API.
Direction: ship as utility functions in the api module, mirroring the
existing Transform pattern. Don't expose a top-level "Action" type in
public Java API — keep "Action" as a REST-spec construct only.
Follow-ups
- PR #13879 (FGAC read restrictions): incorporated the discussion above; will
continue iterating on reviewer feedback as it comes in (please take another
look).
- PR #16394 (X-Iceberg-Client-Capabilities header): raised, [DISCUSS]
thread incoming.
- Next sync: bi-weekly cadence - see the sync notes doc for date.
Links
- Recording: https://youtu.be/b9p6mI-k-0I
- Sync notes doc:
https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit
- PR #13879: https://github.com/apache/iceberg/pull/13879
- PR #16394: https://github.com/apache/iceberg/pull/16394
- DISCUSS:
https://lists.apache.org/thread/xlqx6k7g625p38bxxy141wt02d00w2h4
Thanks to everyone who joined.
Prashant
On Mon, Apr 27, 2026 at 9:05 AM Prashant Singh <[email protected]>
wrote:
> Thank you all for joining the syncs so far!
>
> After much discussion and debate, we've narrowed things down to a final
> list of 9 predefined actions.
>
> Spec update: I updated the spec PR [1] last week - bumping it here as
> well. Please take a look when you get a chance!
>
> POC progress: I've been prototyping with both a SQL client and a NoSQL
> client. The Apache Spark integration [2] fits cleanly. For NoSQL, I'm
> working on the Iceberg Generics reader [3] and PyIceberg in parallel.
>
> Upcoming sync - *04/28* agenda:
> 1. Default mask values per type (I've added some initial proposals to
> kick off the conversation)
> 2. How and where to place Actions in the Iceberg Java API so that engine
> integration is seamless and reusable (of course, engines are free to
> implement their own)
>
> Past sync notes and recordings are available here [4].
>
> Looking forward to seeing everyone at the next sync. Thank you for all
> your valuable feedback!
>
> Best,
> Prashant Singh
>
> [1] https://github.com/apache/iceberg/pull/13879
> [2] https://github.com/apache/iceberg/pull/16082
> [3] https://github.com/apache/iceberg/pull/16131
> [4]
> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>
> On Mon, Mar 23, 2026 at 7:22 PM Prashant Singh <[email protected]>
> wrote:
>
>> Hi all,
>>
>> Here is the summary for *Iceberg Read Restrictions Sync (03/17)*.
>>
>> Recording: https://www.youtube.com/watch?v=LObBU3r_GXg
>> Design Doc:
>> https://docs.google.com/document/d/1D0RcjmiYk0mKtCGak_MyG2dpyj6u19HQ/edit
>>
>> *Execution Order (agreed)*
>> 1. Authorization predicates and row filters on unmasked data
>> 2. Column masks applied
>> 3. User query filters on masked data
>>
>> This prevents point attacks where users craft filters to deduce masked
>> values. If user filters get pushed down before masking, it is the engine's
>> responsibility to handle this correctly and not leave it open to
>> exploitation.
>>
>> *Masking Functions*
>> - Mask to Default — type-specific constant values; preserves schema
>> shape for downstream BI tools. Essential for non-nullable columns where
>> nulls would break joins or engine operations.
>> - Replace with Null — kept as a separate option for optional columns.
>> Serves a different policy intent than mask-to-default.
>> - Alphanumeric Masking — preserves punctuation, redacts letters/numbers
>> with Xs.
>> - Show First/Last Four — partial visibility for identifiers like SSNs.
>> For short strings (<4 chars), the team favored a "dumb mask, smarter admin"
>> approach - admins should pick appropriate masks rather than building
>> complex padding/hashing into the function itself.
>> - Date Truncation — truncate to year or month (day/month replaced with
>> 01). Standard and undisputed.
>> - SHA-256 Hashing — two approaches discussed, both are needed:
>> - Query-local (random salt): allows joins within a single query but
>> not across sessions.
>> - Global stable hash: consistent across sessions for semantic
>> layering.
>> - We plan to continue discussing this in upcoming syncs.
>>
>> *Action Items*
>> - I will research how Apache Ranger and Oracle handle short-string
>> masking before finalizing the spec
>> - Check with the BigQuery team on why some of their masking behaviors
>> are the way they are (Thanks Talat)
>>
>> Looking forward to seeing you all in the next sync!
>>
>> Best,
>> Prashant Singh
>>
>> On Fri, Feb 6, 2026 at 5:35 PM Prashant Singh <[email protected]>
>> wrote:
>>
>>> Thank you everyone for joining the call!
>>> Please find the recording attached [1].
>>> At a high level, we discussed the following:
>>> - *Deny list vs allow list*:
>>> What should the client assume if a given column is not part of the
>>> required column projection: is it allowed to see that column or not?
>>> The consensus seemed to be to use *DENY* as the representation,
>>> considering that an allowlist can be huge for a very wide table. This does
>>> not dictate what a catalog should store when defining its policy; some
>>> catalogs have both ALLOW and DENY.
>>> Essentially, the DENY list describes what a client *should* expect
>>> when consuming policy evaluation results.
>>> Note: *DENY* is generally not recommended since it can cause issues, for
>>> example when a column is added and a user gets access to it
>>> automatically. In this case, however, the policy evaluation results are
>>> coupled with the loadTable request,
>>> so we compute the *DENY* list against the latest schema that was
>>> present at the time the table was loaded. Any new column added to the
>>> schema creates a new Iceberg schema, and clients will not have access to
>>> it.
>>> I will update the PR soon with this recommendation (I request you all to
>>> please participate).
>>>
>>> - *Why Policy Evaluation over Policy Exchange*: we discussed this for
>>> a bit and touched on why the community has been considering this approach,
>>> mostly due to the multitude of policy definitions / dialects out there.
>>> It is equivalent to vended credentials, which are based on the grants the
>>> user has, and it defines clear instructions in a portable way to be
>>> enforced across engines.
>>>
>>> - *Predefined masks over dynamic masks*: The spec is trying to have
>>> a set of predefined actions, mostly inspired by Apache Ranger. There
>>> was a discussion / debate around this, and there seemed to be support for
>>> having both rather than choosing one of them, especially for masks such as
>>> nullify / hash etc.
>>>
>>> - *Expression Expansion*: extending Iceberg expressions to be more than
>>> predicates, including UDF references (the Iceberg UDF spec was ratified
>>> recently). Ryan said he will take a look at it soon (thank you so
>>> much!). We also debated dialects etc. further from the UDF point of view.
>>>
>>> We plan to keep this discussion going. I see some new feedback on the
>>> spec PR [2]; I will address it and add it to the agenda for further
>>> discussion!
>>>
>>> [1] https://www.youtube.com/watch?v=_wKszzNtP48
>>> [2] https://github.com/apache/iceberg/pull/13879#discussion_r2760180338
>>>
>>> Best,
>>> Prashant Singh
>>>
>>> On Mon, Feb 2, 2026 at 4:47 PM Prashant Singh <[email protected]>
>>> wrote:
>>>
>>>> Bumping the thread ^^
>>>>
>>>> Looking forward to seeing you all tomorrow
>>>> Meeting details: Tuesday, Feb 3⋅9:00 – 10:00am Pacific (recurring
>>>> biweekly): https://meet.google.com/gwy-jxos-jif
>>>>
>>>> I proactively added some comments from the spec PR to the agenda:
>>>> https://github.com/apache/iceberg/pull/13879
>>>>
>>>> Best,
>>>> Prashant Singh
>>>>
>>>> On Tue, Jan 20, 2026 at 1:58 PM Prashant Singh <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Having the Iceberg REST catalog return policy evaluation results for
>>>>> fine-grained access control enforcement has been discussed a couple of
>>>>> times in the past, as well as recently in the community. We pretty much
>>>>> have broad agreement on what we want to do at a high level, but there
>>>>> are still some open questions and details to hash out for the spec to
>>>>> get ratified [1].
>>>>>
>>>>> I wanted to propose a dedicated sync for discussing these and closing
>>>>> them out. The time slot we got (thanks, Steven) is:
>>>>>
>>>>> *Biweekly starting from Feb 3 (9:00 am - 10:00 am PST),* you can see
>>>>> the same in your dev event calendar if you subscribe to "Iceberg Dev
>>>>> Events".
>>>>>
>>>>> Please do join, we will keep the sync recorded and capture notes on
>>>>> the doc [2] for this sync.
>>>>>
>>>>> [1] https://github.com/apache/iceberg/pull/13879
>>>>> [2]
>>>>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>>>
>>>>> Best,
>>>>> Prashant Singh