justinmclean commented on code in PR #6849:
URL: https://github.com/apache/gravitino/pull/6849#discussion_r2034330366


##########
docs/admin/iceberg-server.md:
##########
@@ -0,0 +1,1351 @@
+---
+title: Iceberg REST catalog service
+slug: /iceberg-rest-service
+keywords:
+  - Iceberg REST catalog
+license: "This software is licensed under the Apache License version 2."
+---
+
+## Background
+
+The Apache Gravitino Iceberg REST Server follows the
+[Apache Iceberg REST API specification](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml)
+and acts as an Iceberg REST catalog server.
+You can access the Iceberg REST endpoint at `http://$ip:$port/iceberg/`.
+
+### Capabilities
+
+- Supports the Apache Iceberg REST API defined in Iceberg 1.5, and supports all namespace and table interfaces.
+  The following interfaces are not implemented yet:
+  - multi-table transaction
+  - pagination
+- Works as a catalog proxy, supporting `Hive` and `JDBC` as catalog backend.
+- Supports credential vending for `S3`, `GCS`, `OSS`, and `ADLS`.
+- Supports different storages like `S3`, `HDFS`, `OSS`, `GCS`, `ADLS`.
+- Capable of supporting other storages.
+- Supports event listeners.
+- Supports audit logs.
+- Supports OAuth2 and HTTPS.
+- Provides a pluggable metrics store interface to store and delete Iceberg metrics.
+
+## Server management
+
+There are three deployment scenarios for Gravitino Iceberg REST server:
+
+- A standalone server in a standalone Gravitino Iceberg REST server package. The CLASSPATH is `libs`.
+- A standalone server in the Gravitino server package. The CLASSPATH is `iceberg-rest-server/libs`.
+- An auxiliary service embedded in the Gravitino server. The CLASSPATH is `iceberg-rest-server/libs`.
+
+For detailed instructions on how to build and install the Gravitino server package,
+please refer to [the build guide](../develop/how-to-build.md) and [the installation guide](../install/install.md).
+To build the Gravitino Iceberg REST server package, use the command `./gradlew compileIcebergRESTServer -x test`.
+Alternatively, to create the corresponding compressed package in the distribution directory,
+use `./gradlew assembleIcebergRESTServer -x test`.
+The Gravitino Iceberg REST server package includes the following files:
+
+```text
+├─ ...
+└─ distribution/gravitino-iceberg-rest-server
+    ├─ bin/
+    │  └─ gravitino-iceberg-rest-server.sh    # Launching scripts.
+    ├─ conf/                                   # All configurations.
+    │  ├─ gravitino-iceberg-rest-server.conf  # Server configuration.
+    │  ├─ gravitino-env.sh                    # Environment variables, e.g. JAVA_HOME, GRAVITINO_HOME, etc.
+    │  ├─ log4j2.properties                   # log4j configurations.
+    │  └─ hdfs-site.xml & core-site.xml       # HDFS configuration files.
+    ├─ libs/                                   # Dependency libraries.
+    └─ logs/                                   # Logs directory. Auto-created after the server starts.
+```
+
+## Server configuration
+
+There are distinct configuration files for the standalone and auxiliary servers:
+
+- `gravitino-iceberg-rest-server.conf` is used for the standalone server;
+- `gravitino.conf` is for the auxiliary server.
+
+Although the configuration files differ, the configuration items remain the same.
+
+Starting with version `0.6.0-incubating`, the prefix `gravitino.auxService.iceberg-rest.`
+for auxiliary server configurations has been deprecated.
+If both `gravitino.auxService.iceberg-rest.key` and `gravitino.iceberg-rest.key` are present,
+the latter will take precedence.
+The configurations listed below use the `gravitino.iceberg-rest.` prefix.
+
+### Configuration to enable Iceberg REST service in Gravitino server
+
+<table>
+<thead>
+<tr>
+  <th>Configuration item</th>
+  <th>Description</th>
+  <th>Default value</th>
+  <th>Required</th>
+  <th>Since Version</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+  <td><tt>gravitino.auxService.names</tt></td>
+  <td>
+    The auxiliary service name of the Gravitino Iceberg REST catalog service.
+    Use `iceberg-rest`.
+  </td>
+  <td>(none)</td>
+  <td>Yes</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.classpath</tt></td>
+  <td>
+    The CLASSPATH of the Gravitino Iceberg REST catalog service,
+    including the directory containing JARs and configuration.
+    It supports both absolute and relative paths.
+    For example, `iceberg-rest-server/libs,iceberg-rest-server/conf`.
+  </td>
+  <td>(none)</td>
+  <td>Yes</td>
+  <td>`0.2.0`</td>
+</tr>
+</tbody>
+</table>
+
+:::note
+These configurations are only effective in `gravitino.conf`.
+You don't need to specify them if the Iceberg server is started
+as a standalone server.
+:::
+
+### HTTP server configuration
+
+<table>
+<thead>
+<tr>
+  <th>Configuration item</th>
+  <th>Description</th>
+  <th>Default value</th>
+  <th>Required</th>
+  <th>Since Version</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+  <td><tt>gravitino.iceberg-rest.host</tt></td>
+  <td>The host of the Gravitino Iceberg REST catalog service.</td>
+  <td>`0.0.0.0`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.httpPort</tt></td>
+  <td>The port of the Gravitino Iceberg REST catalog service.</td>
+  <td>`9001`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.minThreads</tt></td>
+  <td>
+    The minimum number of threads in the thread pool used by the Jetty Web server.
+    `minThreads` is 8 if the value is less than 8.
+  </td>
+  <td>`Math.max(Math.min(Runtime.getRuntime().availableProcessors() * 2, 100), 8)`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.maxThreads</tt></td>
+  <td>
+    The maximum number of threads in the thread pool used by the Jetty Web server.
+    `maxThreads` is 8 if the value is less than 8, and the value must be greater than or equal to `minThreads`.
+  </td>
+  <td>`Math.max(Runtime.getRuntime().availableProcessors() * 4, 400)`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.threadPoolWorkQueueSize</tt></td>
+  <td>
+    The size of the queue in the thread pool used by Gravitino Iceberg REST catalog service.
+  </td>
+  <td>`100`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.stopTimeout</tt></td>
+  <td>
+    The amount of time in ms for the Gravitino Iceberg REST catalog service to stop gracefully.
+    For more information, see `org.eclipse.jetty.server.Server#setStopTimeout`.
+  </td>
+  <td>`30000`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.idleTimeout</tt></td>
+  <td>The timeout in ms of idle connections.</td>
+  <td>`30000`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.requestHeaderSize</tt></td>
+  <td>The maximum size in bytes for an HTTP request.</td>
+  <td>`131072`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.responseHeaderSize</tt></td>
+  <td>The maximum size in bytes for an HTTP response.</td>
+  <td>`131072`</td>
+  <td>No</td>
+  <td>`0.2.0`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.customFilters</tt></td>
+  <td>
+    Comma-separated list of filter class names to apply to the APIs.
+  </td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.4.0`</td>
+</tr>
+</tbody>
+</table>
+
+The filter in `customFilters` should be a standard javax servlet filter.
+You can also specify filter parameters by setting configuration entries
+in the format `gravitino.iceberg-rest.<filter class name>.param.<name>=<value>`.
+
+### Security
+
+Gravitino Iceberg REST server supports OAuth2 and HTTPS.
+Please refer to the [security documentation](../security/index.md) for more details.
+
+#### Backend authentication
+
+For the JDBC backend, you can use `gravitino.iceberg-rest.jdbc-user` and `gravitino.iceberg-rest.jdbc-password`
+to authenticate the JDBC connection.
+For the Hive backend, you can use `gravitino.iceberg-rest.authentication.type`
+to specify the authentication type, and use `gravitino.iceberg-rest.authentication.kerberos.principal`
+and `gravitino.iceberg-rest.authentication.kerberos.keytab-uri`
+to authenticate the Kerberos connection.
+The detailed configuration items are as follows:
+
+<table>
+<thead>
+<tr>
+  <th>Configuration item</th>
+  <th>Description</th>
+  <th>Default value</th>
+  <th>Required</th>
+  <th>Since Version</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+  <td><tt>gravitino.iceberg-rest.authentication.type</tt></td>
+  <td>
+    The type of authentication for the Iceberg REST catalog backend.
+    This configuration is only applicable to the Hive backend,
+    and currently only supports `Kerberos` and `simple`.
+    For the JDBC backend, only username/password authentication is currently supported.
+  </td>
+  <td>`simple`</td>
+  <td>No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.authentication.impersonation-enable</tt></td>
+  <td>Whether impersonation is enabled for the Iceberg catalog service.</td>
+  <td>`false`</td>
+  <td>No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.hive.metastore.sasl.enabled</tt></td>
+  <td>
+    Whether SASL authentication protocol is enabled when connecting to Kerberos Hive metastore.
+
+    This value should be `true` in most cases
+    when the value of `gravitino.iceberg-rest.authentication.type` is `Kerberos`.
+    In some very rare cases, the SSL protocol is used.
+  </td>
+  <td>`false`</td>
+  <td>No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.authentication.kerberos.principal</tt></td>
+  <td>
+    The principal of the Kerberos authentication.
+
+    This field is required if the value of `gravitino.iceberg-rest.authentication.type` is `Kerberos`.
+  </td>
+  <td>(none)</td>
+  <td>Yes|No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.authentication.kerberos.keytab-uri</tt></td>
+  <td>
+    The URI of the keytab for the Kerberos authentication.
+    This field is required if the value of `gravitino.iceberg-rest.authentication.type` is `Kerberos`.
+  </td>
+  <td>(none)</td>
+  <td>Yes|No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.authentication.kerberos.check-interval-sec</tt></td>
+  <td>The interval in seconds for checking the Kerberos credential for the Iceberg catalog.</td>
+  <td>60</td>
+  <td>No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.authentication.kerberos.keytab-fetch-timeout-sec</tt></td>
+  <td>
+    The fetch timeout in seconds when retrieving Kerberos keytab
+    from `authentication.kerberos.keytab-uri`.
+  </td>
+  <td>60</td>
+  <td>No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+</tbody>
+</table>
+
+### Credential vending
+
+Please refer to [credential vending](../security/credential-vending.md) for 
more details.
+
+### Storage
+
+#### S3 configuration
+
+<table>
+<thead>
+<tr>
+  <th>Configuration item</th>
+  <th>Description</th>
+  <th>Default value</th>
+  <th>Required</th>
+  <th>Since Version</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+  <td><tt>gravitino.iceberg-rest.io-impl</tt></td>
+  <td>
+    The I/O implementation for `FileIO` in Iceberg.
+    Use `org.apache.iceberg.aws.s3.S3FileIO` for S3.
+  </td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.6.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.s3-endpoint</tt></td>
+  <td>
+    An alternative endpoint of the S3 service.
+    This could be used for S3FileIO with any S3-compatible object storage service
+    that has a different endpoint, or to access a private S3 endpoint
+    in a virtual private cloud.
+  </td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.6.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.s3-region</tt></td>
+  <td>The region of the S3 service, like `us-west-2`.</td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.6.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.s3-path-style-access</tt></td>
+  <td>Whether to use path style access for S3.</td>
+  <td>`false`</td>
+  <td>No</td>
+  <td>`0.9.0-incubating`</td>
+</tr>
+</tbody>
+</table>
+
+For other Iceberg S3 properties not managed by Gravitino, such as `s3.sse.type`,
+you can configure them directly with the `gravitino.iceberg-rest.` prefix, for example `gravitino.iceberg-rest.s3.sse.type`.
+
+Please refer to [S3 credentials](../security/credential-vending.md#s3-credentials)
+for credential related configurations.
+
+:::info
+To configure the JDBC catalog backend, set the `gravitino.iceberg-rest.warehouse` parameter
+to `s3://{bucket_name}/${prefix_name}`.
+For the Hive catalog backend, set `gravitino.iceberg-rest.warehouse`
+to `s3a://{bucket_name}/${prefix_name}`.
+Additionally, download the [Iceberg AWS bundle](https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-aws-bundle)
+and place it in the CLASSPATH of Iceberg REST server.
+:::
+
+#### OSS configuration
+
+<table>
+<thead>
+<tr>
+  <th>Configuration item</th>
+  <th>Description</th>
+  <th>Default value</th>
+  <th>Required</th>
+  <th>Since Version</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+  <td><tt>gravitino.iceberg-rest.io-impl</tt></td>
+  <td>
+    The I/O implementation for `FileIO` in Iceberg.
+    Use `org.apache.iceberg.aliyun.oss.OSSFileIO` for OSS.
+  </td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.6.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.oss-endpoint</tt></td>
+  <td>The endpoint of Aliyun OSS service.</td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.7.0-incubating`</td>
+</tr>
+<tr>
+  <td><tt>gravitino.iceberg-rest.oss-region</tt></td>
+  <td>
+    The region of the OSS service, like `oss-cn-hangzhou`.
+    Only used when `credential-providers` is `oss-token`.
+  </td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.8.0-incubating`</td>
+</tr>
+</tbody>
+</table>
+
+For other Iceberg OSS properties not managed by Gravitino, such as `client.security-token`,
+you can configure them directly with the `gravitino.iceberg-rest.` prefix, for example `gravitino.iceberg-rest.client.security-token`.
+
+Please refer to [OSS credentials](../security/credential-vending.md#oss-credentials)
+for credential related configurations.
+
+Additionally, Iceberg doesn't provide an Iceberg Aliyun bundle JAR that contains the OSS packages.
+There are two alternatives for obtaining the OSS packages:
+
+1. Use [Gravitino Aliyun bundle JAR with Hadoop packages](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-aliyun-bundle).
+1. Use [Aliyun JAVA SDK](https://gosspublic.alicdn.com/sdks/java/aliyun_java_sdk_3.10.2.zip)
+   and extract `aliyun-sdk-oss-3.10.2.jar`, `hamcrest-core-1.1.jar`, `jdom2-2.0.6.jar`.
+
+Please place the above jars in the CLASSPATH of Iceberg REST server.
+Refer to [server management](#server-management) for CLASSPATH details.
+
+:::info
+You need to set the `gravitino.iceberg-rest.warehouse` parameter
+to `oss://{bucket_name}/${prefix_name}`. 
+:::
+
+#### GCS
+
+Gravitino supports using a static GCS credential file or generating a GCS token to access GCS data.
+
+<table>
+<thead>
+<tr>
+  <th>Configuration item</th>
+  <th>Description</th>
+  <th>Default value</th>
+  <th>Required</th>
+  <th>Since Version</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+  <td><tt>gravitino.iceberg-rest.io-impl</tt></td>
+  <td>
+    The I/O implementation for `FileIO` in Iceberg.
+    Use `org.apache.iceberg.gcp.gcs.GCSFileIO` for GCS.
+  </td>
+  <td>(none)</td>
+  <td>No</td>
+  <td>`0.6.0-incubating`</td>
+</tr>
+</tbody>
+</table>
+
+For other Iceberg GCS properties not managed by Gravitino, such as `gcs.project-id`,
+you can configure them directly with the `gravitino.iceberg-rest.` prefix, for example `gravitino.iceberg-rest.gcs.project-id`.
+
+Please refer to [GCS credentials](../security/credential-vending.md#gcs-credentials)
+for credential related configurations.

Review Comment:
   for configuration related to credentials.
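
   For reference, the GCS settings described in the hunk above could be combined as in the fragment below. This is only a sketch: it assumes the `key = value` syntax of Gravitino configuration files, and `my-gcp-project` is a placeholder, not a value from the PR.

   ```text
   # Use Iceberg's GCS FileIO implementation (from the GCS table in the hunk).
   gravitino.iceberg-rest.io-impl = org.apache.iceberg.gcp.gcs.GCSFileIO
   # Iceberg GCS property passed through with the gravitino.iceberg-rest. prefix.
   # "my-gcp-project" is a hypothetical project ID.
   gravitino.iceberg-rest.gcs.project-id = my-gcp-project
   ```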