This is an automated email from the ASF dual-hosted git repository.

liuxun pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/gravitino.git

The following commit(s) were added to refs/heads/main by this push:
     new db972abaa [minor] improvement(docs): adjust table properties docs (#5699)
db972abaa is described below

commit db972abaadf27d224ed0eecd36a6d89cbca30270
Author: JUN <oren....@gmail.com>
AuthorDate: Thu Dec 5 10:13:02 2024 +0800

    [minor] improvement(docs): adjust table properties docs (#5699)

    ### What changes were proposed in this pull request?

    Adjust table properties documentation.

    ### Why are the changes needed?

    Add missing information about certain immutable fields and improve readability.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Previewed the markdown output.
---
 docs/apache-hive-catalog.md       | 40 ++++++++++++++++++++-------------------
 docs/jdbc-mysql-catalog.md        | 24 +++++++++++++++--------
 docs/lakehouse-iceberg-catalog.md | 38 +++++++++++++++++++------------------
 docs/lakehouse-paimon-catalog.md  | 35 +++++++++++++++++-----------------
 4 files changed, 74 insertions(+), 63 deletions(-)

diff --git a/docs/apache-hive-catalog.md b/docs/apache-hive-catalog.md
index c789059c0..09bded2c3 100644
--- a/docs/apache-hive-catalog.md
+++ b/docs/apache-hive-catalog.md
@@ -140,25 +140,27 @@ Since 0.6.0-incubating, the data types other than listed above are mapped to Gra

 Table properties supply or set metadata for the underlying Hive tables. The following table lists predefined table properties for a Hive table. Additionally, you can define your own key-value pair properties and transmit them to the underlying Hive database.

-| Property Name | Description | Default Value | Required | Since version |
-|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|----------|---------------|
-| `location` | The location for table storage, such as `/user/hive/warehouse/test_table`. | HMS uses the database location as the parent directory by default. | No | 0.2.0 |
-| `table-type` | Type of the table. Valid values include `MANAGED_TABLE` and `EXTERNAL_TABLE`. | `MANAGED_TABLE` | No | 0.2.0 |
-| `format` | The table file format. Valid values include `TEXTFILE`, `SEQUENCEFILE`, `RCFILE`, `ORC`, `PARQUET`, `AVRO`, `JSON`, `CSV`, and `REGEX`. | `TEXTFILE` | No | 0.2.0 |
-| `input-format` | The input format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcInputFormat`. | The property `format` sets the default value `org.apache.hadoop.mapred.TextInputFormat` and can change it to a different default. | No | 0.2.0 |
-| `output-format` | The output format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat`. | The property `format` sets the default value `org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat` and can change it to a different default. | No | 0.2.0 |
-| `serde-lib` | The serde library class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcSerde`. | The property `format` sets the default value `org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe` and can change it to a different default. | No | 0.2.0 |
-| `serde.parameter.` | The prefix of the serde parameter, such as `"serde.parameter.orc.create.index" = "true"`, indicating `ORC` serde lib to create row indexes | (none) | No | 0.2.0 |
-
-Hive automatically adds and manages some reserved properties. Users aren't allowed to set these properties.
-
-| Property Name | Description | Since Version |
-|-------------------------|-------------------------------------------------|---------------|
-| `comment` | Used to store a table comment. | 0.2.0 |
-| `numFiles` | Used to store the number of files in the table. | 0.2.0 |
-| `totalSize` | Used to store the total size of the table. | 0.2.0 |
-| `EXTERNAL` | Indicates whether the table is external. | 0.2.0 |
-| `transient_lastDdlTime` | Used to store the last DDL time of the table. | 0.2.0 |
+:::note
+**Reserved**: Fields that cannot be passed to the Gravitino server.
+
+**Immutable**: Fields that cannot be modified once set.
+:::
+
+| Property Name | Description | Default Value | Required | Reserved | Immutable | Since Version |
+|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|----------|----------|-----------|---------------|
+| `location` | The location for table storage, such as `/user/hive/warehouse/test_table`. | HMS uses the database location as the parent directory by default. | No | No | Yes | 0.2.0 |
+| `table-type` | Type of the table. Valid values include `MANAGED_TABLE` and `EXTERNAL_TABLE`. | `MANAGED_TABLE` | No | No | Yes | 0.2.0 |
+| `format` | The table file format. Valid values include `TEXTFILE`, `SEQUENCEFILE`, `RCFILE`, `ORC`, `PARQUET`, `AVRO`, `JSON`, `CSV`, and `REGEX`. | `TEXTFILE` | No | No | Yes | 0.2.0 |
+| `input-format` | The input format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcInputFormat`. | The property `format` sets the default value `org.apache.hadoop.mapred.TextInputFormat` and can change it to a different default. | No | No | Yes | 0.2.0 |
+| `output-format` | The output format class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat`. | The property `format` sets the default value `org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat` and can change it to a different default. | No | No | Yes | 0.2.0 |
+| `serde-lib` | The serde library class for the table, such as `org.apache.hadoop.hive.ql.io.orc.OrcSerde`. | The property `format` sets the default value `org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe` and can change it to a different default. | No | No | Yes | 0.2.0 |
+| `serde.parameter.` | The prefix of the serde parameter, such as `"serde.parameter.orc.create.index" = "true"`, indicating `ORC` serde lib to create row indexes | (none) | No | No | Yes | 0.2.0 |
+| `serde-name` | The name of the serde | Table name by default. | No | No | Yes | 0.2.0 |
+| `comment` | Used to store a table comment. | (none) | No | Yes | No | 0.2.0 |
+| `numFiles` | Used to store the number of files in the table. | (none) | No | Yes | No | 0.2.0 |
+| `totalSize` | Used to store the total size of the table. | (none) | No | Yes | No | 0.2.0 |
+| `EXTERNAL` | Indicates whether the table is external. | (none) | No | Yes | No | 0.2.0 |
+| `transient_lastDdlTime` | Used to store the last DDL time of the table. | (none) | No | Yes | No | 0.2.0 |

 ### Table indexes

diff --git a/docs/jdbc-mysql-catalog.md b/docs/jdbc-mysql-catalog.md
index cca3b1603..6c228faf6 100644
--- a/docs/jdbc-mysql-catalog.md
+++ b/docs/jdbc-mysql-catalog.md
@@ -96,7 +96,7 @@ Refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metada
 | `Integer` | `Int` |
 | `Unsigned Integer` | `Int Unsigned` |
 | `Long` | `Bigint` |
-| `Unsigned Long` | `Bigint Unsigned` |
+| `Unsigned Long` | `Bigint Unsigned` |
 | `Float` | `Float` |
 | `Double` | `Double` |
 | `String` | `Text` |
@@ -160,7 +160,7 @@ Column[] cols = new Column[] {
 };
 Index[] indexes = new Index[] {
     Indexes.of(IndexType.PRIMARY_KEY, "PRIMARY", new String[][]{{"id"}})
-}
+};
 ```

 </TabItem>
@@ -170,12 +170,20 @@ Index[] indexes = new Index[] {

 Although MySQL itself does not support table properties, Gravitino offers table property management for MySQL tables through the `jdbc-mysql` catalog, enabling control over table features. The supported properties are listed as follows:

-| Property Name | Description | Required | Since version |
-|-------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|---------------|
-| `engine` | The engine used by the table. The default value is `InnoDB`. For example `MyISAM`, `MEMORY`, `CSV`, `ARCHIVE`, `BLACKHOLE`, `FEDERATED`, `ndbinfo`, `MRG_MYISAM`, `PERFORMANCE_SCHEMA`. | No | 0.4.0 |
-| `auto-increment-offset` | Used to specify the starting value of the auto-increment field. | No | 0.4.0 |
+:::note
+**Reserved**: Fields that cannot be passed to the Gravitino server.
+
+**Immutable**: Fields that cannot be modified once set.
+:::
+
+:::caution
+- Doesn't support remove table properties. You can only add or modify properties, not delete properties.
+:::

-- Doesn't support remove table properties. You can only modify values, not delete properties.
+| Property Name | Description | Default Value | Required | Reserved | Immutable | Since version |
+|-------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|-----------|------------|-----------|---------------|
+| `engine` | The engine used by the table. For example `MyISAM`, `MEMORY`, `CSV`, `ARCHIVE`, `BLACKHOLE`, `FEDERATED`, `ndbinfo`, `MRG_MYISAM`, `PERFORMANCE_SCHEMA`. | `InnoDB` | No | No | Yes | 0.4.0 |
+| `auto-increment-offset` | Used to specify the starting value of the auto-increment field. | (none) | No | No | Yes | 0.4.0 |

 ### Table indexes

@@ -213,7 +221,7 @@ The index name of the PRIMARY_KEY must be PRIMARY
 Index[] indexes = new Index[] {
     Indexes.of(IndexType.PRIMARY_KEY, "PRIMARY", new String[][]{{"id"}}),
     Indexes.of(IndexType.UNIQUE_KEY, "id_name_uk", new String[][]{{"id"} , {"name"}}),
-}
+};
 ```

 </TabItem>
diff --git a/docs/lakehouse-iceberg-catalog.md b/docs/lakehouse-iceberg-catalog.md
index 28b9b37a9..393ef26b8 100644
--- a/docs/lakehouse-iceberg-catalog.md
+++ b/docs/lakehouse-iceberg-catalog.md
@@ -32,7 +32,7 @@ Builds with Apache Iceberg `1.5.2`. The Apache Iceberg table format version is `
 - S3
 - HDFS
 - OSS
-- Supports Kerberos or simple authentication for Iceberg catalog with Hive backend.
+- Supports Kerberos or simple authentication for Iceberg catalog with Hive backend.

 ### Catalog properties

@@ -151,7 +151,7 @@ Users can use the following properties to configure the security of the catalog

 Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#catalog-operations) for more details.

-## Schema
+## Schema

 ### Schema capabilities

@@ -165,7 +165,7 @@ You could put properties except `comment`.

 Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations) for more details.

-## Table
+## Table

 ### Table capabilities

@@ -313,23 +313,25 @@ Meanwhile, the data types other than listed above are mapped to Gravitino **[Ext

 You can pass [Iceberg table properties](https://iceberg.apache.org/docs/1.5.2/configuration/) to Gravitino when creating an Iceberg table.

-The Gravitino server doesn't allow passing the following reserved fields.
+:::note
+**Reserved**: Fields that cannot be passed to the Gravitino server.

-| Configuration item | Description | Since Version |
-|---------------------------|--------------------------------------------------------------------------------------|---------------|
-| `comment` | The table comment, please use `comment` field in table meta instead. | 0.2.0 |
-| `creator` | The table creator. | 0.2.0 |
-| `current-snapshot-id` | The snapshot represents the current state of the table. | 0.2.0 |
-| `cherry-pick-snapshot-id` | Selecting a specific snapshot in a merge operation. | 0.2.0 |
-| `sort-order` | Iceberg table sort order, please use `SortOrder` in table meta instead. | 0.2.0 |
-| `identifier-fields` | The identifier fields for defining the table. | 0.2.0 |
-| `write.distribution-mode` | Defines distribution of write data, please use `distribution` in table meta instead. | 0.2.0 |
-
-Gravitino server doesn't allow to change such properties:
+**Immutable**: Fields that cannot be modified once set.
+:::

-| Configuration item | Description | Default value | Required | Since Version |
-|--------------------|----------------------------------------------|---------------|----------|---------------|
-| `location` | Iceberg location for table storage. | None | No | 0.2.0 |
+| Configuration item | Description | Default value | Required | Reserved | Immutable | Since Version |
+|---------------------------|---------------------------------------------------------------------------------------|---------------|----------|----------|-----------|---------------|
+| `location` | Iceberg location for table storage. | (none) | No | No | Yes | 0.2.0 |
+| `provider` | The storage provider for table storage. | (none) | No | No | Yes | 0.2.0 |
+| `format` | The format of table storage. | (none) | No | No | Yes | 0.2.0 |
+| `format-version` | The format version of table storage. | (none) | No | No | Yes | 0.2.0 |
+| `comment` | The table comment, please use `comment` field in table meta instead. | (none) | No | Yes | No | 0.2.0 |
+| `creator` | The table creator. | (none) | No | Yes | No | 0.2.0 |
+| `current-snapshot-id` | The snapshot represents the current state of the table. | (none) | No | Yes | No | 0.2.0 |
+| `cherry-pick-snapshot-id` | Selecting a specific snapshot in a merge operation. | (none) | No | Yes | No | 0.2.0 |
+| `sort-order` | Iceberg table sort order, please use `SortOrder` in table meta instead. | (none) | No | Yes | No | 0.2.0 |
+| `identifier-fields` | The identifier fields for defining the table. | (none) | No | Yes | No | 0.2.0 |
+| `write.distribution-mode` | Defines distribution of write data, please use `distribution` in table meta instead. | (none) | No | Yes | No | 0.2.0 |

 ### Table indexes

diff --git a/docs/lakehouse-paimon-catalog.md b/docs/lakehouse-paimon-catalog.md
index 61b1449e9..d53ad4827 100644
--- a/docs/lakehouse-paimon-catalog.md
+++ b/docs/lakehouse-paimon-catalog.md
@@ -48,7 +48,7 @@ Builds with Apache Paimon `0.8.0`.
 | `s3-secret-access-key` | The secret key of the AWS S3. | (none) | required if the value of `warehouse` is a S3 path | 0.7.0-incubating |

 :::note
-If you want to use the `oss` or `s3` warehouse, you need to place related jars in the `catalogs/lakehouse-paimon/lib` directory, more information can be found in the [Paimon S3](https://paimon.apache.org/docs/master/filesystems/s3/).
+If you want to use the `oss` or `s3` warehouse, you need to place related jars in the `catalogs/lakehouse-paimon/lib` directory, more information can be found in the [Paimon S3](https://paimon.apache.org/docs/master/filesystems/s3/).
 :::

 :::note
@@ -75,7 +75,7 @@ You must download the corresponding JDBC driver and place it to the `catalogs/la

 Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#catalog-operations) for more details.

-## Schema
+## Schema

 ### Schema capabilities

@@ -94,7 +94,7 @@ Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational

 Please refer to [Manage Relational Metadata Using Gravitino](./manage-relational-metadata-using-gravitino.md#schema-operations) for more details.

-## Table
+## Table

 ### Table capabilities

@@ -185,23 +185,22 @@ Gravitino doesn't support Paimon `MultisetType` type.

 You can pass [Paimon table properties](https://paimon.apache.org/docs/0.8/maintenance/configurations/) to Gravitino when creating a Paimon table.

-The Gravitino server doesn't allow passing the following reserved fields.
-
-| Configuration item | Description |
-|------------------------------------|--------------------------------------------------------------|
-| `comment` | The table comment. |
-| `owner` | The table owner. |
-| `bucket-key` | The table bucket-key. |
-| `primary-key` | The table primary-key. |
-| `partition` | The table partition. |
+:::note
+**Reserved**: Fields that cannot be passed to the Gravitino server.

-The Gravitino server doesn't allow the following immutable fields to be modified, but allows them to be specified when creating a new table.
+**Immutable**: Fields that cannot be modified once set.
+:::

-| Configuration item | Description |
-|------------------------------------|--------------------------------------------------------------|
-| `merge-engine` | The table merge-engine. |
-| `sequence.field` | The table sequence.field. |
-| `rowkind.field` | The table rowkind.field. |
+| Configuration item | Description | Default Value | Required | Reserved | Immutable | Since version |
+|------------------------------------|--------------------------------------------------------------|---------------|-----------|----------|-----------|-------------------|
+| `merge-engine` | The table merge-engine. | (none) | No | No | Yes | 0.6.0-incubating |
+| `sequence.field` | The table sequence.field. | (none) | No | No | Yes | 0.6.0-incubating |
+| `rowkind.field` | The table rowkind.field. | (none) | No | No | Yes | 0.6.0-incubating |
+| `comment` | The table comment. | (none) | No | Yes | No | 0.6.0-incubating |
+| `owner` | The table owner. | (none) | No | Yes | No | 0.6.0-incubating |
+| `bucket-key` | The table bucket-key. | (none) | No | Yes | No | 0.6.0-incubating |
+| `primary-key` | The table primary-key. | (none) | No | Yes | No | 0.6.0-incubating |
+| `partition` | The table partition. | (none) | No | Yes | No | 0.6.0-incubating |

 ### Table operations