Thanks Jack and team for working on this proposal. I went over it and it is
very well written. I particularly like:

(1) The fact that it is adopting the SQL standard and adjusting some of its
semantics to fit the Iceberg model.

(2) It includes views from v1. Views are a very important tool for policy
enforcement. We have built a dynamic privacy and compliance enforcement
catalog extension at LinkedIn using views [1], and one of the main
improvements to that catalog extension would be securable view objects.
Admittedly, it might require further improvements to compute engines to
implement the permissions, but having an Iceberg spec would be the first
step.

Looking forward to the next steps of the proposal discussion and adoption.

[1]
https://www.slideshare.net/slideshow/viewshift-hassle-free-dynamic-policy-enforcement-for-every-data-lake/269577447

Thanks,
Walaa.


On Thu, May 30, 2024 at 10:35 PM Jack Ye <yezhao...@gmail.com> wrote:

> Hi everyone,
>
> A few colleagues at AWS and I would like to discuss a new proposal for
> supporting securable objects in the Iceberg REST catalog spec.
>
> Here is our proposal as a Google doc:
> https://docs.google.com/document/d/1KmIDbPuN6IYF0nWs9ostXIB9F4b8iH3zZO0hjgs1lm4/edit
>
> And here is the corresponding GitHub issue:
> https://github.com/apache/iceberg/issues/10407
>
> I will also paste the intro here for an overview. There are 2 main reasons
> for us to look into this area and draft this proposal:
>
> *IRC lacks privilege-related concepts to express access decisions:*
>
> In a proposal we published previously regarding access decisions, we wanted
> to express ideas such as a LoadTableResponse telling the engine the list of
> privileges (e.g. read only, read and write, insert but no update, etc.) the
> caller has on the table, so that compute engines can enforce them
> accordingly.
>
> However, as we explored deeper into this topic, we found that there is no
> standard in Iceberg to express such privileges on table objects. And when
> we started to come up with keywords like SELECT, INSERT, DELETE, etc. to
> express such privileges, we realized that we were basically defining a
> securable object framework that is well-known in database systems (see
> Reference Works section for more details). This is the primary reason we
> are publishing this proposal before making further progress on the access
> decision work.
>
> *IRC lacks clear guidelines on access management requirements:*
>
> This is feedback we heard frequently when interviewing AWS customers using
> Iceberg and considering building an IRC. Today Iceberg objects (namespaces,
> tables, views) are not securable within the Iceberg catalog itself, and
> need to be secured using an auxiliary system. This means that an
> organization building an IRC service needs to wrap many important
> operations into custom-built APIs for downstream users to consume (e.g. an
> API to grant Iceberg table access on S3 needs to grant corresponding IAM
> users/roles the right S3 policy or ACL setting). A huge amount of effort
> needs to be spent figuring out which APIs are missing in IRC to satisfy
> enterprise-level data warehouse access management requirements.
>
> There are some IRC products that offer vendor-specific APIs outside IRC to
> perform those operations, but this means that users are locked in to that
> vendor’s securable object management system when using the IRC solution,
> and do not have the true freedom to easily switch to another solution if it
> offers better price-performance.
>
> We understand that Iceberg is not a security product, and it is not in the
> best interest of the community to dive too deep into security-related
> domains. However, we believe that *we should at least offer the right
> interfaces and set the right standards for how Iceberg catalog expresses
> securable objects and how Iceberg catalog users interact with those objects*,
> such that (1) users who would like to build an IRC have a clear guideline
> on what API contract to implement for managing access to objects in IRC,
> and (2) users who are on one IRC product do not need to be locked in due
> to access management aspects.
>
> Would really appreciate any feedback on this topic and proposal!
>
> Best,
> Jack Ye
>
