Hi all,

  Here is the summary for *Iceberg Read Restrictions Sync (03/17)*.

  Recording: https://www.youtube.com/watch?v=LObBU3r_GXg
  Design Doc:
https://docs.google.com/document/d/1D0RcjmiYk0mKtCGak_MyG2dpyj6u19HQ/edit

  *Execution Order (agreed)*
  1. Authorization predicates and row filters on unmasked data
  2. Column masks applied
  3. User query filters on masked data

  This ordering prevents point attacks, in which a user crafts filters to
  deduce masked values. If an engine pushes user filters down below masking,
  it is the engine's responsibility to do so correctly and not leave that
  path open to exploitation.
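
  For illustration only, the agreed order can be sketched in plain Python
  over a list of row dicts. The function name read_with_restrictions, its
  parameters, and the sample data are hypothetical and are not taken from
  the spec or from any Iceberg API.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


def read_with_restrictions(
    rows: Iterable[Row],
    row_filter: Callable[[Row], bool],
    column_masks: Dict[str, Callable[[Any], Any]],
    user_filter: Callable[[Row], bool],
) -> List[Row]:
    """Hypothetical sketch: row filter, then masks, then user filter."""
    result = []
    for row in rows:
        # 1. Authorization predicates / row filters see the unmasked data.
        if not row_filter(row):
            continue
        # 2. Column masks are applied next.
        masked = {
            name: column_masks[name](value) if name in column_masks else value
            for name, value in row.items()
        }
        # 3. User query filters only ever see masked values.
        if user_filter(masked):
            result.append(masked)
    return result


# Made-up sample data for the sketch.
rows = [
    {"region": "EU", "ssn": "123-45-6789"},
    {"region": "EU", "ssn": "987-65-4321"},
    {"region": "US", "ssn": "555-55-5555"},
]

# A point-attack style probe: the user filters on a guessed raw SSN.
probe = read_with_restrictions(
    rows,
    row_filter=lambda r: r["region"] == "EU",
    column_masks={"ssn": lambda v: "XXX-XX-" + v[-4:]},
    user_filter=lambda r: r["ssn"] == "123-45-6789",
)
print(probe)
```

  Because the user filter is evaluated against the masked value, the probe
  cannot confirm a guessed raw SSN.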

  *Masking Functions*
  - Mask to Default: type-specific constant values; preserves the schema
    shape for downstream BI tools. Essential for non-nullable columns, where
    nulls would break joins or engine operations.
  - Replace with Null: kept as a separate option for optional columns. It
    serves a different policy intent than mask-to-default.
  - Alphanumeric Masking: preserves punctuation and redacts letters and
    numbers with Xs.
  - Show First/Last Four: partial visibility for identifiers like SSNs. For
    short strings (<4 chars), the team favored a "dumb mask, smarter admin"
    approach: admins should pick appropriate masks rather than building
    complex padding/hashing into the function itself.
  - Date Truncation: truncate to year or month (day/month replaced with
    01). Standard and undisputed.
  - SHA-256 Hashing: two approaches were discussed, and both are needed:
    - Query-local (random salt): allows joins within a single query but not
      across sessions.
    - Global stable hash: consistent across sessions for semantic layering.
    - We plan to continue discussing this in upcoming syncs.
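
  A minimal Python sketch of a few of these masks follows, purely as an
  illustration. All function names, the salt handling, and the short-string
  behavior (values of four or fewer characters pass through unchanged) are
  assumptions made for this sketch; the spec has not finalized any of them.

```python
import hashlib
import secrets
from datetime import date


def mask_alphanumeric(value: str) -> str:
    """Replace letters and digits with 'X'; keep punctuation and spaces."""
    return "".join("X" if ch.isalnum() else ch for ch in value)


def show_last_four(value: str) -> str:
    """Mask everything except the last four characters.

    Deliberately "dumb" (an assumption of this sketch): a value of four or
    fewer characters is returned unchanged, so an admin would need to pick
    a different mask for short values.
    """
    return mask_alphanumeric(value[:-4]) + value[-4:]


def truncate_date(value: date, unit: str) -> date:
    """Truncate to 'year' or 'month'; the dropped parts become 01."""
    if unit == "year":
        return value.replace(month=1, day=1)
    if unit == "month":
        return value.replace(day=1)
    raise ValueError(f"unsupported unit: {unit}")


def sha256_mask(value: str, salt: bytes) -> str:
    """Hex SHA-256 of salt + value (salt prefixing is an assumption)."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


# Query-local: a fresh random salt per query, so equal inputs hash equally
# within that query but not across queries or sessions.
query_salt = secrets.token_bytes(16)

# Global stable: a fixed salt (placeholder value) gives the same hash in
# every session.
GLOBAL_SALT = b"example-global-salt"

print(mask_alphanumeric("AB-12 cd"))
print(show_last_four("123-45-6789"))
print(truncate_date(date(2026, 3, 17), "month"))
print(sha256_mask("alice", query_salt) == sha256_mask("alice", query_salt))
```

  The two hashing variants differ only in where the salt comes from, which
  is why one function with a salt parameter is enough for the sketch.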

  *Action Items*
  - I will research how Apache Ranger and Oracle handle short-string
    masking before finalizing the spec.
  - Check with the BigQuery team on why some of their masking behaviors are
    the way they are (thanks, Talat).

  Looking forward to seeing you all in the next sync!

Best,
Prashant Singh

On Fri, Feb 6, 2026 at 5:35 PM Prashant Singh <[email protected]>
wrote:

> Thank you, everyone, for joining the call!
> Please find the recording at [1].
> At a high level, we discussed the following:
> - *Deny list vs. allow list*:
> What does the client assume if a given column is not part of the required
> column projection: is it allowed to see that column or not?
> The consensus seemed to be to use *DENY* as the representation, considering
> that the allow list can be huge for a very wide table. This does not dictate
> what the catalog should store when defining its policy; some catalogs have
> both ALLOW and DENY.
> Essentially, the DENY list defines what a client *should* expect
> when consuming policy evaluation results.
> Note: *DENY* is generally not recommended, since it can cause issues. For
> example, when a column is added, users get access to it automatically.
> In this case, however, the policy evaluation results are coupled with the
> loadTable request, so we compute the *DENY* list against the latest schema
> that was present at the time the table was loaded. Any new column added to
> the schema will create a new Iceberg schema, and clients will not have
> access to it.
> I will update the PR soon with this recommendation (I request you all to
> please participate).
>
> - *Why Policy Evaluation over Policy Exchange*: we discussed this for a
> bit and touched on why the community has been considering this approach,
> mostly due to the multitude of policy definitions / dialects out there.
> This is equivalent to vended credentials, which are issued based on the
> grants the user has, and it defines clear instructions in a portable way
> that can be enforced across engines.
>
> - *Predefined masks over dynamic masks*: The spec is trying to have a
> set of predefined actions, mostly inspired by Apache Ranger. There was a
> discussion / debate around it, and there seemed to be support for having
> both rather than choosing one of them, especially for masks such as
> nullify / hash etc.
>
> - *Expression Expansion*: extending Iceberg expressions to be more than
> predicates, including UDF references (the Iceberg UDF spec got ratified
> recently). Ryan said he will be taking a look into it soon (thank you so
> much!). We also debated dialects etc. from a UDF point of view.
>
> We plan to keep this discussion going. I see some new feedback on the spec
> PR [2]; I will address it and add it to the agenda for further discussion!
>
> [1] https://www.youtube.com/watch?v=_wKszzNtP48
> [2] https://github.com/apache/iceberg/pull/13879#discussion_r2760180338
>
> Best,
> Prashant Singh
>
> On Mon, Feb 2, 2026 at 4:47 PM Prashant Singh <[email protected]>
> wrote:
>
>> Bumping the thread ^^
>>
>> Looking forward to seeing you all tomorrow
>> Meeting details: Tuesday, Feb 3⋅9:00 – 10:00am Pacific (recurring
>> biweekly): https://meet.google.com/gwy-jxos-jif
>>
>> I proactively added some comments from the spec PR to the agenda:
>> https://github.com/apache/iceberg/pull/13879
>>
>> Best,
>> Prashant Singh
>>
>> On Tue, Jan 20, 2026 at 1:58 PM Prashant Singh <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> Having the Iceberg REST catalog return policy evaluation results for
>>> fine-grained access control enforcement has been discussed a couple of
>>> times in the past, as well as recently in the community. We pretty much
>>> have broad agreement on what we want to do at a high level, but there are
>>> still some open questions and details to hash out for the spec to get
>>> ratified [1].
>>>
>>> I wanted to propose a dedicated sync for discussing and closing
>>> them. The time slot we got (thanks, Steven) is:
>>>
>>> *Biweekly starting from Feb 3 (9:00 am - 10:00 am PST).* You can see
>>> it in your dev event calendar if you subscribe to "Iceberg Dev
>>> Events".
>>>
>>> Please do join. We will record the sync and capture notes in the
>>> doc [2].
>>>
>>> [1] https://github.com/apache/iceberg/pull/13879
>>> [2]
>>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>
>>> Best,
>>> Prashant Singh
