Hi all,

Quick summary from yesterday's community sync for those who couldn't attend.

Recording: https://youtu.be/-KEesN1udyY

1. *Client Capabilities Header (PR #16394) <https://github.com/apache/iceberg/pull/16394>*

Re-discussed the generic vs. per-feature header debate from the 05/12 sync. Rough consensus leaned toward a single generic X-Iceberg-Client-Capabilities header, with these clarifications:

- *Advisory, not authoritative*. Servers MUST NOT use the header for trust or authorization decisions. Trust is established out-of-band (mTLS / OAuth / engine identity).
- *No versioning at the header level yet*. The fail-closed contract ("client MUST fail on unrecognized payload contents") handles forward compatibility. New incompatible behavior would be a new capability token, not a version suffix.
- *No change to existing per-request directives*. X-Iceberg-Access-Delegation stays as-is for vended-credentials vs. remote-signing selection. Capabilities and per-request preferences are kept as separate mechanisms.
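
A rough Python sketch of how the two sides of this contract could look. The comma-separated token format, the field names in KNOWN_RESTRICTION_FIELDS, and both helper functions are assumptions for illustration only; PR #16394 defines the actual header format.

```python
# Assumption: the header carries a comma-separated list of capability tokens.
HEADER = "X-Iceberg-Client-Capabilities"


def parse_capabilities(headers: dict[str, str]) -> frozenset[str]:
    """Server side: read the advertised tokens. Advisory only, never for authz."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get(HEADER.lower(), "")
    return frozenset(tok.strip() for tok in raw.split(",") if tok.strip())


# Hypothetical payload fields that this particular client knows how to enforce.
KNOWN_RESTRICTION_FIELDS = frozenset({"row-filter", "column-masks"})


def check_payload_fail_closed(read_restrictions: dict) -> None:
    """Client side: fail on unrecognized payload contents instead of ignoring them."""
    unknown = set(read_restrictions) - KNOWN_RESTRICTION_FIELDS
    if unknown:
        raise ValueError(f"unrecognized read-restriction contents: {sorted(unknown)}")
```

Because the client fails closed on anything it does not recognize, an incompatible behavior can be introduced as a new token plus new payload contents, without versioning the header itself.
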

Dan was absent for the final stretch, following up separately on the list before revising the PR.

2. *Expression Spec Enhancement*

ID-based field references (needed for unbounded row filter expressions in read restrictions) will move forward. Spec write-up to follow. In the meantime, the read-restrictions spec PR will reference the planned change.

3. *Read Restrictions Spec - Call for Review*

The spec PR (#13879) <https://github.com/apache/iceberg/pull/13879> is ready for another review pass. It will be updated with the expression spec reference once that write-up lands, but the rest of the content is in shape for review now. Would appreciate eyes on it.

4. *API for Action functions (PR #16198 <https://github.com/apache/iceberg/pull/16198>) - Call for Review*

Generic functions module + actions wrapper is in the PR with end-to-end plumbing.

Thanks,
Prashant

On Mon, May 18, 2026 at 12:01 PM Prashant Singh <[email protected]> wrote:

> Hi all,
>
> Sharing a summary of the Iceberg Read Restrictions sync on May 12, 2026, for
> folks who couldn't attend. (As always, syncs are for discussion only.)
> Recording: https://youtu.be/b9p6mI-k-0I
>
> Topics discussed
>
> 1. NULL handling for mask-to-default
>
> Question: should mask-to-default preserve NULL inputs (NULL → NULL) or
> replace them with the type-specific default (NULL → 0 / "" / epoch / etc.)?
> The direction in the room was to NOT preserve NULL - preserving leaks the
> existence of NULL, which can itself be sensitive information. The other
> actions (replace-with-null, mask-alphanum, show-first-4 / show-last-4,
> sha-256 variants, truncate-to-year / truncate-to-month) keep their natural
> NULL-preserving behavior; mask-to-default is the exception. This is
> reflected in the most recent push to PR #13879.
>
> 2. Older clients without read-restriction support
>
> How should the spec handle clients that don't understand the
> read-restrictions field returned by loadTable?
> Direction: introduce a generic client-capability header
> (X-Iceberg-Client-Capabilities) as a forward-compat signal, separate from
> per-request signals like X-Iceberg-Access-Delegation.
> Trust establishment between client and catalog stays out of scope; it is an
> operator/catalog-implementation concern, not a spec concern.
>
> A follow-up PR (#16394) has been raised to add the header and the parameter
> component; a separate [DISCUSS] thread is being raised in parallel.
>
> 3. Identity propagation (Trino, Spark, etc.)
>
> Surfaced briefly. Folks acknowledged identity propagation across
> multi-tenant query engines is a real problem, but it's orthogonal to the
> spec - it's a catalog / auth-manager concern. Not in scope for #13879.
>
> 4. Actions API placement
>
> Continued discussion on where action definitions live in the Java API.
> Direction: ship as utility functions in the api module, mirroring the
> existing Transform pattern. Don't expose a top-level "Action" type in the
> public Java API; keep "Action" as a REST-spec construct only.
>
>
> Follow-ups
>
> - PR #13879 (FGAC read restrictions): incorporated the discussion above;
>   will continue iterating on reviewer feedback as it comes in (please take
>   another look).
> - PR #16394 (X-Iceberg-Client-Capabilities header): raised, [DISCUSS]
>   thread incoming.
> - Next sync: bi-weekly cadence - see the sync notes doc for date.
>
> Links
>
> - Recording: https://youtu.be/b9p6mI-k-0I
> - Sync notes doc:
>   https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit
> - PR #13879: https://github.com/apache/iceberg/pull/13879
> - PR #16394: https://github.com/apache/iceberg/pull/16394
> - DISCUSS:
>   https://lists.apache.org/thread/xlqx6k7g625p38bxxy141wt02d00w2h4
>
> Thanks to everyone who joined.
>
> Prashant
>
> On Mon, Apr 27, 2026 at 9:05 AM Prashant Singh <[email protected]> wrote:
>
>> Thank you all for joining the syncs so far!
>>
>> After much discussion and debate, we've narrowed things down to a final
>> list of 9 predefined actions.
>>
>> Spec update: I updated the spec PR [1] last week - bumping it here as
>> well. Please take a look when you get a chance!
>>
>> POC progress: I've been prototyping with both a SQL client and a NoSQL
>> client. The Apache Spark integration [2] fits cleanly. For NoSQL, I'm
>> working on the Iceberg Generics reader [3] and PyIceberg in parallel.
>>
>> Upcoming sync - *04/28* agenda:
>> 1. Default mask values per type (I've added some initial proposals to
>> kick off the conversation)
>> 2. Where to put Actions in Iceberg Java so that engine integration is
>> seamless and reusable (of course, engines are free to implement their own)
>>
>> Past sync notes and recordings are available here [4].
>>
>> Looking forward to seeing everyone at the next sync. Thank you for all
>> your valuable feedback!
>>
>> Best,
>> Prashant Singh
>>
>> [1] https://github.com/apache/iceberg/pull/13879
>> [2] https://github.com/apache/iceberg/pull/16082
>> [3] https://github.com/apache/iceberg/pull/16131
>> [4]
>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>
>> On Mon, Mar 23, 2026 at 7:22 PM Prashant Singh <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> Here is the summary for *Iceberg Read Restrictions Sync (03/17)*.
>>>
>>> Recording: https://www.youtube.com/watch?v=LObBU3r_GXg
>>> Design Doc:
>>> https://docs.google.com/document/d/1D0RcjmiYk0mKtCGak_MyG2dpyj6u19HQ/edit
>>>
>>> *Execution Order (agreed)*
>>> 1. Authorization predicates and row filters on unmasked data
>>> 2. Column masks applied
>>> 3. User query filters on masked data
>>>
>>> This prevents point attacks where users craft filters to deduce masked
>>> values. If user filters get pushed down before masking, it is the
>>> engine's responsibility to handle that correctly and not leave it open
>>> to exploitation.
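
The agreed execution order can be sketched in a few lines of Python. This is a toy model over in-memory rows, not engine code: read_with_restrictions, the sample rows, the region policy, and the "XXX-XX-" mask formatting are all hypothetical and only illustrate why user filters must run on masked data.

```python
from typing import Any, Callable

Row = dict[str, Any]


def read_with_restrictions(
    rows: list[Row],
    row_filter: Callable[[Row], bool],
    masks: dict[str, Callable[[Any], Any]],
    user_filter: Callable[[Row], bool],
) -> list[Row]:
    # 1. Authorization predicates and row filters run on unmasked data.
    authorized = [row for row in rows if row_filter(row)]
    # 2. Column masks are applied; columns without a mask pass through.
    masked = [
        {col: masks[col](val) if col in masks else val for col, val in row.items()}
        for row in authorized
    ]
    # 3. User query filters only ever see masked values.
    return [row for row in masked if user_filter(row)]


rows = [
    {"region": "us", "ssn": "123-45-6789"},
    {"region": "us", "ssn": "987-65-4321"},
    {"region": "eu", "ssn": "555-00-1111"},
]
policy_filter = lambda row: row["region"] == "us"
masks = {"ssn": lambda v: "XXX-XX-" + v[-4:]}  # hypothetical show-last-4 style mask

# A probing filter on a guessed raw value matches nothing after masking.
probe = read_with_restrictions(rows, policy_filter, masks, lambda r: r["ssn"] == "123-45-6789")
# A filter on what the mask still exposes keeps working.
visible = read_with_restrictions(rows, policy_filter, masks, lambda r: r["ssn"].endswith("6789"))
```

If step 3 ran before step 2, the probing filter would return a row and confirm the guessed value, which is the point attack the ordering is meant to prevent.
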
>>>
>>> *Masking Functions*
>>> - Mask to Default: type-specific constant values; preserves schema
>>> shape for downstream BI tools. Essential for non-nullable columns where
>>> nulls would break joins or engine operations.
>>> - Replace with Null: kept as a separate option for optional columns.
>>> Serves a different policy intent than mask-to-default.
>>> - Alphanumeric Masking: preserves punctuation, redacts
>>> letters/numbers with Xs.
>>> - Show First/Last Four: partial visibility for identifiers like SSNs.
>>> For short strings (<4 chars), the team favored a "dumb mask, smarter
>>> admin" approach - admins should pick appropriate masks rather than
>>> building complex padding/hashing into the function itself.
>>> - Date Truncation: truncate to year or month (day/month replaced with
>>> 01). Standard and undisputed.
>>> - SHA-256 Hashing: two approaches discussed, both are needed:
>>>   - Query-local (random salt): allows joins within a single query but
>>>   not across sessions.
>>>   - Global stable hash: consistent across sessions for semantic
>>>   layering.
>>>   - We plan to continue discussing this in upcoming syncs.
>>>
>>> *Action Items*
>>> - I will research how Apache Ranger and Oracle handle short-string
>>> masking before finalizing the spec
>>> - Check with the BigQuery team on why some of their masking behaviors
>>> are the way they are (Thanks Talat)
>>>
>>> Looking forward to seeing you all in the next sync!
>>>
>>> Best,
>>> Prashant Singh
>>>
>>> On Fri, Feb 6, 2026 at 5:35 PM Prashant Singh <[email protected]> wrote:
>>>
>>>> Thank you everyone for joining the call!
>>>> Please find the recording attached [1].
>>>> At a high level, we discussed the following:
>>>> - *Deny list vs. allow list*:
>>>> What should the client assume if a given column is not part of the
>>>> required column projection? Is it allowed to see that column or not?
>>>> The consensus seemed to be to use *DENY* as the representation,
>>>> since an allowlist can be huge for a very wide table. This does not
>>>> dictate what a catalog should store when defining its policy; some
>>>> catalogs have both ALLOW and DENY.
>>>> Essentially, the DENY list is what a client *should*
>>>> expect when consuming policy evaluation results.
>>>> Note: *DENY* is generally not recommended, since it can cause issues,
>>>> especially when, say, a column is added and a user gets access to it
>>>> automatically. In this case, however, the policy evaluation results are
>>>> coupled with the loadTable request,
>>>> so we compute the *DENY* list against the latest schema that was
>>>> present at the time the table was loaded. Any new column added to the
>>>> schema will create a new Iceberg schema, and clients will not have
>>>> access to it.
>>>> I will update the PR soon with this recommendation (I request you all
>>>> to please participate).
>>>>
>>>> - *Why Policy Evaluation over Policy Exchange*: we discussed this for
>>>> a bit and touched on why the community has been considering this
>>>> approach, mostly due to the multitude of policy definitions / dialects
>>>> out there. It is equivalent to vended creds, which are issued based on
>>>> the grants the user has, and it defines clear instructions in a portable
>>>> way to be enforced across engines.
>>>>
>>>> - *Predefined masks over dynamic masks*: the spec is trying to have
>>>> a set of predefined actions, mostly inspired by Apache Ranger. There
>>>> was a discussion / debate around it, and there seemed to be support for
>>>> having both rather than choosing one of them, especially for masks such
>>>> as nullify / hash etc.
>>>>
>>>> - *Expression Expansion*: Iceberg expressions to be more than
>>>> predicates, expanded to include UDF references (the Iceberg UDF spec was
>>>> ratified recently). Ryan said he will take a look at it soon (thank
>>>> you so much!). We also debated dialects etc. from the UDF point of view.
>>>>
>>>> We plan to keep this discussion going. I see some new feedback on the
>>>> spec PR [2]; I will address it and add it to the agenda for further
>>>> discussion!
>>>>
>>>> [1] https://www.youtube.com/watch?v=_wKszzNtP48
>>>> [2] https://github.com/apache/iceberg/pull/13879#discussion_r2760180338
>>>>
>>>> Best,
>>>> Prashant Singh
>>>>
>>>> On Mon, Feb 2, 2026 at 4:47 PM Prashant Singh <[email protected]> wrote:
>>>>
>>>>> Bumping the thread ^^
>>>>>
>>>>> Looking forward to seeing you all tomorrow.
>>>>> Meeting details: Tuesday, Feb 3 ⋅ 9:00 – 10:00am Pacific (recurring
>>>>> biweekly): https://meet.google.com/gwy-jxos-jif
>>>>>
>>>>> I proactively added some comments from the spec PR to the agenda:
>>>>> https://github.com/apache/iceberg/pull/13879
>>>>>
>>>>> Best,
>>>>> Prashant Singh
>>>>>
>>>>> On Tue, Jan 20, 2026 at 1:58 PM Prashant Singh <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Having the Iceberg REST catalog return policy evaluation results for
>>>>>> fine-grained access control enforcement has been discussed a couple
>>>>>> of times in the past, as well as recently in the community. We pretty
>>>>>> much have broad agreement on what we want to do at a high level, but
>>>>>> there are still some open questions and details to hash out for the
>>>>>> spec to get ratified [1].
>>>>>>
>>>>>> I wanted to propose a dedicated sync for discussing and closing
>>>>>> these. The time slot we got was (thanks Steven):
>>>>>>
>>>>>> *Biweekly starting from Feb 3 (9:00 am - 10:00 am PST)*. You can see
>>>>>> the same in your dev event calendar if you subscribe to "Iceberg Dev
>>>>>> Events".
>>>>>>
>>>>>> Please do join. We will record the sync and capture notes in
>>>>>> the doc [2].
>>>>>>
>>>>>> [1] https://github.com/apache/iceberg/pull/13879
>>>>>> [2]
>>>>>> https://docs.google.com/document/d/1iGNydKY7XT1N5Nz056vDPM0P8v0MFymGqNtOlUGUp-c/edit?tab=t.0#heading=h.tevndn85fps
>>>>>>
>>>>>> Best,
>>>>>> Prashant Singh
