This is an automated email from the ASF dual-hosted git repository.

liuxun pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git


The following commit(s) were added to refs/heads/main by this push:
     new 58d397ff7 [#4587] docs: Refactor the security document and add 
authorization push-down page (#4624)
58d397ff7 is described below

commit 58d397ff78e0e413fda0cddb8ab0ef8090b7525a
Author: Xun <x...@datastrato.com>
AuthorDate: Thu Aug 22 19:56:45 2024 +0800

    [#4587] docs: Refactor the security document and add authorization 
push-down page (#4624)
    
    ### What changes were proposed in this pull request?
    
    1. Refactor the security document
    2. Add authorization push-down page
    
    ### Why are the changes needed?
    
    Fix: #4587
    
    ### Does this PR introduce _any_ user-facing change?
    
    N/A
    
    ### How was this patch tested?
    
    CI Passed
    
    ---------
    
    Co-authored-by: roryqi <ror...@apache.org>
    Co-authored-by: yuqi <y...@datastrato.com>
---
 docs/apache-hive-catalog.md                     |   2 +-
 docs/assets/security/authorization-pushdown.png | Bin 0 -> 256747 bytes
 docs/assets/{ => security}/object.png           | Bin
 docs/assets/{ => security}/privilege.png        | Bin
 docs/assets/{ => security}/role.png             | Bin
 docs/assets/{ => security}/user-group.png       | Bin
 docs/assets/{ => security}/workflow.png         | Bin
 docs/security/access-control.md                 |  87 +++++++++++++++++++-----
 docs/security/authorization-pushdown.md         |  48 +++++++++++++
 9 files changed, 119 insertions(+), 18 deletions(-)

diff --git a/docs/apache-hive-catalog.md b/docs/apache-hive-catalog.md
index 063108d9d..8dd6ed094 100644
--- a/docs/apache-hive-catalog.md
+++ b/docs/apache-hive-catalog.md
@@ -47,7 +47,7 @@ When you use the Gravitino with Trino. You can pass the Trino 
Hive connector con
 
 When you use the Gravitino with Spark. You can pass the Spark Hive connector 
configuration using prefix `spark.bypass.`. For example, using 
`spark.bypass.hive.exec.dynamic.partition.mode` to pass the 
`hive.exec.dynamic.partition.mode` to the Spark Hive connector in Spark runtime.
 
-
+When you use Gravitino authorization for Hive with Apache Ranger, see the [Authorization Hive with Ranger properties](security/authorization-pushdown.md#authorization-hive-with-ranger-properties).
 ### Catalog operations
 
 Refer to [Manage Relational Metadata Using 
Gravitino](./manage-relational-metadata-using-gravitino.md#catalog-operations) 
for more details.
diff --git a/docs/assets/security/authorization-pushdown.png 
b/docs/assets/security/authorization-pushdown.png
new file mode 100644
index 000000000..20ab25abe
Binary files /dev/null and b/docs/assets/security/authorization-pushdown.png 
differ
diff --git a/docs/assets/object.png b/docs/assets/security/object.png
similarity index 100%
rename from docs/assets/object.png
rename to docs/assets/security/object.png
diff --git a/docs/assets/privilege.png b/docs/assets/security/privilege.png
similarity index 100%
rename from docs/assets/privilege.png
rename to docs/assets/security/privilege.png
diff --git a/docs/assets/role.png b/docs/assets/security/role.png
similarity index 100%
rename from docs/assets/role.png
rename to docs/assets/security/role.png
diff --git a/docs/assets/user-group.png b/docs/assets/security/user-group.png
similarity index 100%
rename from docs/assets/user-group.png
rename to docs/assets/security/user-group.png
diff --git a/docs/assets/workflow.png b/docs/assets/security/workflow.png
similarity index 100%
rename from docs/assets/workflow.png
rename to docs/assets/security/workflow.png
diff --git a/docs/security/access-control.md b/docs/security/access-control.md
index 902fc0591..37b2714da 100644
--- a/docs/security/access-control.md
+++ b/docs/security/access-control.md
@@ -7,6 +7,28 @@ license: "This software is licensed under the Apache License 
version 2."
 
 ## Overview
 
+Apache Gravitino (incubating) is a technical data catalog that uses a unified metadata paradigm to manage multiple data sources while still allowing multiple engines such as Spark, Trino, and Flink, as well as Python clients, to connect to these data sources for data processing through Gravitino.
+
+Because each underlying data source has its own access control system, it can be difficult to plug in data engines that need to query several of these data sources at once.
+This matters most to data governance practitioners, who have to worry about data access restrictions and data compliance issues.
+To address this, Gravitino plans to implement a universal set of privilege models and paradigms.
+With this, users will be able to manage all of their data sources on a single access plane, regardless of whether the data source is a database, a message queue, or an object storage system.
+
+After authorizing these data sources within Gravitino’s metadata lake, authentication can then be performed in the Spark, Trino, and Flink engines, as well as in our Python client.
+This abstraction allows users to control access to data and make compliant use of the data without obstructing other teams or worrying about the tedious work of managing individual access control systems.
+
+### Gravitino Privilege Model
+
+Gravitino’s unified management model allows each data source to keep its own authorization features. However, each data source may come with its own dedicated authorization model and methods.
+If permissions are not set properly in the underlying system, the underlying authorization system may produce permission inconsistencies when a user tries to access the data, and cause issues for external access.
+To mitigate this issue, Gravitino aims to provide a unified authorization model and accompanying methods that sit on top of all the data sources, making it much easier to manage access privileges.
+
+It is important to note that Gravitino’s authorization model will not merge the access control systems of the underlying data sources to form a large and unwieldy set of privileges.
+Instead, we will summarize the privileges currently in use within the data systems and offer a set of Gravitino-native privilege models that accurately reflects them.
+
+When users and data engines process data through Gravitino, this permission model addresses the complexity of managing access control for different data sources.
+It is meant to keep everything within the Gravitino system while still managing the access control settings of each data source separately.
+
 Gravitino adopts RBAC and DAC. 
 
 Role-based Access Control (RBAC): Access privileges are assigned to roles, 
which are in turn assigned to users or groups.
@@ -20,24 +42,60 @@ Gravitino doesn't support metadata authentication. It means 
that Gravitino won't
 
 :::
 
-
 ## Concept
 
+### Authorization
+
+Gravitino also provides a set of authorization frameworks to interact with different underlying data source
+authorization systems (e.g., MySQL's own access control management and the Apache Ranger access control management system for big data)
+in accordance with its own authorization model and methodology.
+For more information, see [Authorization push-down](authorization-pushdown.md).
+
+### Authentication
+
+Gravitino uses Ownership to control management privileges on securable objects and uses Roles to control access to securable objects,
+so when a user performs a specific operation on a specified resource,
+Gravitino performs a composite authentication on the Ownership and Role to which the securable object belongs.
+When a user has more than one Role, Gravitino uses the user's current Role for authentication, and the user can switch the current Role to access a different securable object.
+
 ### Role
 
-A metadata object to which privileges can be granted. Roles are in turn 
assigned to users or groups.
+Traditional access control systems generally use RBAC (Role-Based Access Control) for access control management,
+where each Role contains a collection of operating privileges on different securable objects.
+When the system adds a new user or user group, you can select the Roles to grant to them,
+so that they can start working right away, without waiting for the administrator to set up access privileges to securable objects one by one.
+
+Roles also employ the concept of ownership: the owner of a Role is by default the creator of the Role,
+which means the owner has full control over the Role, including deleting it.
 
 ### Privilege
 
-A defined level of access to an object. Multiple distinct privileges may be 
used to control the granularity of access granted.
+A Privilege is a specific operation on a securable object. If you need fine-grained control over a securable object in the system,
+you need to design many different Privileges. However, too many Privileges make the authorization settings overly complicated.
 
-### User
+If you only need coarse-grained privilege control on the securable objects in the system, you only need to design a small number of Privileges,
+but this results in control that is too weak at authentication time. Therefore, the design of Privileges is an important trade-off in the access control system.
+Privileges are generally divided into two types: the management category, such as the `CREATE` and `DELETE` resource privileges,
+and the operation category, such as the `READ` and `WRITE` resource privileges.
 
-A user identity recognized by Gravitino. External user system instead of 
Gravitino manages users. 
+In most organizations, the number of data managers is much smaller than the 
number of data users.
+Because it is the data users who need fine-grained privilege control,
+we must provide more Privileges related to usage and gatekeep the administrative Privileges more tightly.
+To enforce this, we introduce the concept of Ownership as a complete replacement for the administrative category of Privilege.
 
-### Group
+### Ownership
+
+When you create a securable object (Gravitino Service, Metalake, Catalog, or any other entity) in Gravitino, the entity has an Owner field that defines the user (or group) to which the resource belongs.
+The owner of each entity implicitly holds the administrative privileges on it, for example, the privilege to delete that securable object.
+Only the Owner of a securable object can fully manage that resource.
+If a securable object needs to be managed by more than one person at the same time, ownership can be assigned to a user group.
 
-A group identity recognized by Gravitino. External user system instead of 
Gravitino manages groups. 
+### User
+Users are generally granted one or multiple Roles, and users have different 
operating privileges depending on their Role.
+
+### Group
+To make it easier to grant the same access to multiple users, we can add users to a user group, and then grant one or multiple roles to that user group.
+All users belonging to that user group then have the privileges in those roles.
 
 ### Metadata objects
 
@@ -55,17 +113,12 @@ The top container is the metalake.
 Catalogs are under the metalake. Catalogs represent different kinds of data 
sources.
 Schemas are under the catalog. There are tables, topics, or filesets under the 
schema.
 
-![object_image](../assets/object.png)
+![object_image](../assets/security/object.png)
 
 The relationship of the concepts is as below.
 
-![user_group_relationship_image](../assets/user-group.png)
-![concept_relationship_image](../assets/role.png)
-
-### Ownership
-
-Every metadata object has an owner. The owner could be a user or group, and 
has all the privileges of the metadata object.
-Meanwhile, you can transfer the ownership of securable object to another user 
or group.
+![user_group_relationship_image](../assets/security/user-group.png)
+![concept_relationship_image](../assets/security/role.png)
 
 ## The types of roles
 
@@ -180,7 +233,7 @@ If parent securable object has the same privilege name with 
different condition,
 For example, securable metalake object allows to use the catalog, but 
securable catalog denies to use the catalog, the user isn't able to use the 
catalog.
 If securable metalake object denies to use the catalog, but securable catalog 
allows to use the catalog, the user isn't able to use the catalog, too.
 
-![privilege_image](../assets/privilege.png)
+![privilege_image](../assets/security/privilege.png)
 
 ## Server Configuration
 
@@ -611,7 +664,7 @@ client.setOwner(table, "user1", "USER");
 
 You can follow the steps to achieve the authorization of Gravitino.
 
-![concept_workflow_image](../assets/workflow.png)
+![concept_workflow_image](../assets/security/workflow.png)
 
 1. Service admin configures the Gravitino server to enable authorization and 
creates a metalake.
 
diff --git a/docs/security/authorization-pushdown.md 
b/docs/security/authorization-pushdown.md
new file mode 100644
index 000000000..bab70144f
--- /dev/null
+++ b/docs/security/authorization-pushdown.md
@@ -0,0 +1,48 @@
+---
+title: "Authorization Push-down"
+slug: /security/authorization-push-down
+keyword: security
+license: "This software is licensed under the Apache License version 2."
+---
+
+## Authorization Push-down
+
+![authorization push down](../assets/security/authorization-pushdown.png)
+
+Gravitino offers a set of authorization frameworks that integrate with various 
underlying data source permission systems, such as MySQL's native permission 
management and Apache Ranger for big data. These frameworks align with 
Gravitino's own authorization model and methodology.
+Gravitino manages different data sources through Catalogs, and when a user 
performs an authorization operation on data within a Catalog, Gravitino invokes 
the Authorization Plugin module for that Catalog.
+This module translates Gravitino's authorization model into the permission 
rules of the underlying data source. The permissions are then enforced by the 
underlying permission system via the respective client, such as JDBC or the 
Apache Ranger client.
+
+### Authorization Hive with Ranger properties
+
+In order to use the Authorization Ranger Hive Plugin, you need to configure 
the following properties and [Apache Hive catalog 
properties](../apache-hive-catalog.md#catalog-properties):
+
+| Property Name                       | Description                                                                                                                                                    | Default Value | Required | Since Version |
+|-------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
+| `authorization-provider`            | The provider used to implement the authorization plugin, such as `ranger`.                                                                                     | (none)        | No       | 0.6.0         |
+| `authorization.ranger.admin.url`    | The Apache Ranger web URIs.                                                                                                                                    | (none)        | No       | 0.6.0         |
+| `authorization.ranger.auth.type`    | The Apache Ranger authentication type, `simple` or `kerberos`.                                                                                                 | `simple`      | No       | 0.6.0         |
+| `authorization.ranger.username`     | The Apache Ranger admin web login username (auth type `simple`) or Kerberos principal (auth type `kerberos`). The user needs Ranger administrator permission.  | (none)        | No       | 0.6.0         |
+| `authorization.ranger.password`     | The Apache Ranger admin web login user password (auth type `simple`) or the path of the keytab file (auth type `kerberos`).                                    | (none)        | No       | 0.6.0         |
+| `authorization.ranger.service.name` | The Apache Ranger service name.                                                                                                                                | (none)        | No       | 0.6.0         |
+
+Once the configuration is in place, you can perform authorization operations by calling the Gravitino [authorization RESTful API](https://gravitino.apache.org/docs/latest/api/rest/grant-roles-to-a-user).
+
+#### Example of using the Authorization Ranger Hive Plugin
+
+Suppose you have an Apache Hive service in your datacenter and have created a 
`hiveRepo` in Apache Ranger to manage its permissions. 
+The Ranger service is accessible at `172.0.0.100:6080`, with the username 
`Jack` and the password `PWD123`. 
+To add this Hive service to Gravitino using the Hive catalog, you'll need to 
configure the following parameters.
+
+```properties
+authorization-provider=ranger
+authorization.ranger.admin.url=172.0.0.100:6080
+authorization.ranger.auth.type=simple
+authorization.ranger.username=Jack
+authorization.ranger.password=PWD123
+authorization.ranger.service.name=hiveRepo
+```
+
+:::caution
+Gravitino 0.6.0 only supports authorization for the Apache Ranger Hive service; authorization for more data sources is under development.
+:::
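
Not part of the commit: a minimal sketch of how the Ranger properties documented in `docs/security/authorization-pushdown.md` could be sanity-checked before they are handed to a Hive catalog. The helper names `parse_properties` and `check_ranger_properties` are hypothetical and do not exist in Gravitino. The sketch also assumes that every `authorization.ranger.*` property without a documented default must be set once `authorization-provider=ranger` is chosen, even though the table marks each one as not required.

```python
# Hypothetical helpers, not part of Gravitino. They only mirror the property
# table in docs/security/authorization-pushdown.md.

EXAMPLE = """\
authorization-provider=ranger
authorization.ranger.admin.url=172.0.0.100:6080
authorization.ranger.auth.type=simple
authorization.ranger.username=Jack
authorization.ranger.password=PWD123
authorization.ranger.service.name=hiveRepo
"""

# Properties with no documented default (assumption: all are needed for ranger).
NO_DEFAULT_KEYS = (
    "authorization.ranger.admin.url",
    "authorization.ranger.username",
    "authorization.ranger.password",
    "authorization.ranger.service.name",
)


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        props[key.strip()] = value.strip()
    return props


def check_ranger_properties(props):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if props.get("authorization-provider") != "ranger":
        return problems  # only the ranger provider is documented
    # The documented default for the auth type is `simple`.
    auth_type = props.get("authorization.ranger.auth.type", "simple")
    if auth_type not in ("simple", "kerberos"):
        problems.append(f"unsupported auth type: {auth_type}")
    for key in NO_DEFAULT_KEYS:
        if not props.get(key):
            problems.append(f"missing {key}")
    return problems


if __name__ == "__main__":
    print(check_ranger_properties(parse_properties(EXAMPLE)))
```

With the example values from the new page, the check flags nothing. Removing a property or using an auth type other than `simple` or `kerberos` adds an entry to the returned list.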
