This is an automated email from the ASF dual-hosted git repository.

roryqi pushed a commit to branch ISSUE-6353
in repository https://gitbox.apache.org/repos/asf/gravitino.git
commit 7da2e4d35f6be008a167b7ba3b4a5cc477ff869c
Author: Qi Yu <y...@datastrato.com>
AuthorDate: Thu Jan 16 11:39:28 2025 +0800

    [#6249] fix(docs): Fix incorrect description about configuration `endpoint` in s3 catalog (#6265)

    ### What changes were proposed in this pull request?

    `s3-endpoint` is a required configuration for the Hadoop File System to access S3, but it's an optional value via PyArrow s3fs.

    ### Why are the changes needed?

    The existing description was inaccurate.

    Fix: #6249

    ### Does this PR introduce _any_ user-facing change?

    N/A

    ### How was this patch tested?

    N/A
---
 docs/hadoop-catalog-with-s3.md | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/docs/hadoop-catalog-with-s3.md b/docs/hadoop-catalog-with-s3.md
index f138276189..2c8f8131b5 100644
--- a/docs/hadoop-catalog-with-s3.md
+++ b/docs/hadoop-catalog-with-s3.md
@@ -28,14 +28,14 @@ Once the server is up and running, you can proceed to configure the Hadoop catal
 
 In addition to the basic configurations mentioned in [Hadoop-catalog-catalog-configuration](./hadoop-catalog.md#catalog-properties), the following properties are necessary to configure a Hadoop catalog with S3:
 
-| Configuration item             | Description [...]
-|--------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- [...]
-| `filesystem-providers`         | The file system providers to add. Set it to `s3` if it's a S3 fileset, or a comma separated string that contains `s3` like `gs,s3` to support multiple kinds of fileset including `s3`. [...]
-| `default-filesystem-provider`  | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local`, for S3, if we set this value, we can omit the prefix 's3a://' in the location. [...]
-| `s3-endpoint`                  | The endpoint of the AWS S3. This configuration is optional for S3 service, but required for other S3-compatible storage services like MinIO. [...]
-| `s3-access-key-id`             | The access key of the AWS S3. [...]
-| `s3-secret-access-key`         | The secret key of the AWS S3. [...]
-| `credential-providers`         | The credential provider types, separated by comma, possible value can be `s3-token`, `s3-secret-key`. As the default authentication type is using AKSK as the above, this configuration can enable credential vending provided by Gravitino server and client will no longer need to provide authentication information like AKSK to access S3 by GVFS. Once it's set, more configuration items are needed to make it work, please see [s3-credential-vending](security/ [...]
+| Configuration item             | Description                                                                                                                                                                                                                                                                                                                                                                                    | Default value   | Required | Since version    |
+|--------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------|----------|------------------|
+| `filesystem-providers`         | The file system providers to add. Set it to `s3` if it's a S3 fileset, or a comma separated string that contains `s3` like `gs,s3` to support multiple kinds of fileset including `s3`.                                                                                                                                                                                                         | (none)          | Yes      | 0.7.0-incubating |
+| `default-filesystem-provider`  | The name default filesystem providers of this Hadoop catalog if users do not specify the scheme in the URI. Default value is `builtin-local`, for S3, if we set this value, we can omit the prefix 's3a://' in the location.                                                                                                                                                                    | `builtin-local` | No       | 0.7.0-incubating |
+| `s3-endpoint`                  | The endpoint of the AWS S3.                                                                                                                                                                                                                                                                                                                                                                    | (none)          | Yes      | 0.7.0-incubating |
+| `s3-access-key-id`             | The access key of the AWS S3.                                                                                                                                                                                                                                                                                                                                                                  | (none)          | Yes      | 0.7.0-incubating |
+| `s3-secret-access-key`         | The secret key of the AWS S3.                                                                                                                                                                                                                                                                                                                                                                  | (none)          | Yes      | 0.7.0-incubating |
+| `credential-providers`         | The credential provider types, separated by comma, possible value can be `s3-token`, `s3-secret-key`. As the default authentication type is using AKSK as the above, this configuration can enable credential vending provided by Gravitino server and client will no longer need to provide authentication information like AKSK to access S3 by GVFS. Once it's set, more configuration items are needed to make it work, please see [s3-credential-vending](security/ [...]
 
 ### Configurations for a schema
@@ -245,14 +245,13 @@ catalog.as_fileset_catalog().create_fileset(ident=NameIdentifier.of("schema", "e
 
 To access fileset with S3 using the GVFS Java client, based on the [basic GVFS configurations](./how-to-use-gvfs.md#configuration-1), you need to add the following configurations:
 
-| Configuration item     | Description                                                                                                                                  | Default value | Required | Since version    |
-|------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|---------------|----------|------------------|
-| `s3-endpoint`          | The endpoint of the AWS S3. This configuration is optional for S3 service, but required for other S3-compatible storage services like MinIO. | (none)        | No       | 0.7.0-incubating |
-| `s3-access-key-id`     | The access key of the AWS S3.                                                                                                                | (none)        | Yes      | 0.7.0-incubating |
-| `s3-secret-access-key` | The secret key of the AWS S3.                                                                                                                | (none)        | Yes      | 0.7.0-incubating |
+| Configuration item     | Description                   | Default value | Required | Since version    |
+|------------------------|-------------------------------|---------------|----------|------------------|
+| `s3-endpoint`          | The endpoint of the AWS S3.   | (none)        | Yes      | 0.7.0-incubating |
+| `s3-access-key-id`     | The access key of the AWS S3. | (none)        | Yes      | 0.7.0-incubating |
+| `s3-secret-access-key` | The secret key of the AWS S3. | (none)        | Yes      | 0.7.0-incubating |
 
 :::note
-- `s3-endpoint` is an optional configuration for AWS S3, however, it is required for other S3-compatible storage services like MinIO.
 - If the catalog has enabled [credential vending](security/credential-vending.md), the properties above can be omitted. More details can be found in [Fileset with credential vending](#fileset-with-credential-vending).
 :::
@@ -447,7 +446,7 @@ In order to access fileset with S3 using the GVFS Python client, apart from [bas
 | `s3_secret_access_key` | The secret key of the AWS S3. | (none) | Yes | 0.7.0-incubating |
 
 :::note
-- `s3_endpoint` is an optional configuration for AWS S3, however, it is required for other S3-compatible storage services like MinIO.
+- `s3_endpoint` is an optional configuration for the GVFS **Python** client but a required configuration for the GVFS **Java** client to access Hadoop with AWS S3, and it is required for other S3-compatible storage services like MinIO.
 - If the catalog has enabled [credential vending](security/credential-vending.md), the properties above can be omitted.
 :::
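
The requirement difference this patch documents (the endpoint is required on the Hadoop/Java path, optional for the PyArrow-based Python client) can be sketched as a small validation helper. This is only an illustration of the rule stated in the tables above: the helper name `missing_s3_config` and its key sets are hypothetical and do not correspond to any Gravitino API.

```python
# Illustrative sketch (not a Gravitino API): which S3 options the patched
# docs mark as required for each GVFS client.
# Java client (Hadoop S3A): `s3-endpoint` is required.
# Python client (PyArrow s3fs): `s3_endpoint` may be omitted for AWS S3.
REQUIRED_JAVA_KEYS = {"s3-endpoint", "s3-access-key-id", "s3-secret-access-key"}
REQUIRED_PYTHON_KEYS = {"s3_access_key_id", "s3_secret_access_key"}

def missing_s3_config(options: dict, client: str) -> set:
    """Return the required keys absent from `options` for the given client."""
    required = REQUIRED_JAVA_KEYS if client == "java" else REQUIRED_PYTHON_KEYS
    return required - options.keys()

# A Java-client config without an endpoint is incomplete...
java_opts = {"s3-access-key-id": "AKIA...", "s3-secret-access-key": "secret"}
print(missing_s3_config(java_opts, "java"))    # -> {'s3-endpoint'}

# ...while the equivalent Python-client config is acceptable for AWS S3.
py_opts = {"s3_access_key_id": "AKIA...", "s3_secret_access_key": "secret"}
print(missing_s3_config(py_opts, "python"))    # -> set()
```

Note that for S3-compatible services such as MinIO, the endpoint remains necessary for both clients, as the updated `:::note` in the diff points out.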