Re: [PR] [FLINK-16080][docs-zh] Translate new DISTRIBUTED BY documentation into Chinese [flink]

via GitHub Mon, 19 May 2025 20:19:19 -0700


RocMarshal commented on code in PR #26196:
URL: https://github.com/apache/flink/pull/26196#discussion_r2096784440



##########
docs/content.zh/docs/dev/table/sql/create.md:
##########
@@ -414,33 +414,40 @@ Flink 假设声明了主键的列都是不包含 Null 值的，Connector 在处
 
 ### `DISTRIBUTED`
 
-Buckets enable load balancing in an external storage system by splitting data into disjoint subsets. These subsets group rows with potentially "infinite" keyspace into smaller and more manageable chunks that allow for efficient parallel processing.
+分桶通过将数据拆分为互不相交的子集，实现外部存储系统的负载均衡。
+这些子集将理论上具有 “无限 ”键空间的行划分为更小且更易于管理的块，从而实现高效的并行处理。
 
-Bucketing depends heavily on the semantics of the underlying connector. However, a user can influence the bucketing behavior by specifying the number of buckets, the bucketing algorithm, and (if the algorithm allows it) the columns which are used for target bucket calculation.
+分桶行为在很大程度上取决于底层连接器的具体实现。不过用户仍然可以通过以下方式影响分桶行为：
 
-All bucketing components (i.e. bucket number, distribution algorithm, bucket key columns) are
-optional from a SQL syntax perspective.
+ 1. 指定桶的数量。
+ 2. 选择分桶算法。
+ 3. 指定用于计算目标桶的列（如果分桶算法支持）。
 
-Given the following SQL statements:
+从 SQL 语法角度来看，所有分桶组件（即桶数量、分桶算法、分桶键列）均为可选配置。
+
+以下是不同分桶定义的 SQL 示例：
 
 ```sql
--- Example 1
+-- 示例 1
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY HASH(uid) INTO 4 BUCKETS;
 
--- Example 2
+-- 示例 2
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid) INTO 4 BUCKETS;
 
--- Example 3
+-- 示例 3
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid);
 
--- Example 4
+-- 示例 4
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED INTO 4 BUCKETS;
 ```
 
-Example 1 declares a hash function on a fixed number of 4 buckets (i.e. HASH(uid) % 4 = target
-bucket). Example 2 leaves the selection of an algorithm up to the connector. Additionally,
-Example 3 leaves the number of buckets up  to the connector.
-In contrast, Example 4 only defines the number of buckets.
+示例 1 明确声明了一个分桶，根据 uid 列的哈希值分配到 4 个桶（即 目标桶 = HASH(uid) % 4）。
+
+示例 2 在示例 1 的基础上将分桶算法交由连接器决定。

Review Comment:
   ```suggestion
   示例 2 仅声明了分桶列和桶的数量，剩余要素即分桶算法则由连接器决定。
   ```



##########
docs/content.zh/docs/dev/table/sql/create.md:
##########
@@ -414,33 +414,40 @@ Flink 假设声明了主键的列都是不包含 Null 值的，Connector 在处
 
 ### `DISTRIBUTED`
 
-Buckets enable load balancing in an external storage system by splitting data into disjoint subsets. These subsets group rows with potentially "infinite" keyspace into smaller and more manageable chunks that allow for efficient parallel processing.
+分桶通过将数据拆分为互不相交的子集，实现外部存储系统的负载均衡。
+这些子集将理论上具有 “无限 ”键空间的行划分为更小且更易于管理的块，从而实现高效的并行处理。
 
-Bucketing depends heavily on the semantics of the underlying connector. However, a user can influence the bucketing behavior by specifying the number of buckets, the bucketing algorithm, and (if the algorithm allows it) the columns which are used for target bucket calculation.
+分桶行为在很大程度上取决于底层连接器的具体实现。不过用户仍然可以通过以下方式影响分桶行为：
 
-All bucketing components (i.e. bucket number, distribution algorithm, bucket key columns) are
-optional from a SQL syntax perspective.
+ 1. 指定桶的数量。
+ 2. 选择分桶算法。
+ 3. 指定用于计算目标桶的列（如果分桶算法支持）。
 
-Given the following SQL statements:
+从 SQL 语法角度来看，所有分桶组件（即桶数量、分桶算法、分桶键列）均为可选配置。

Review Comment:
   ```suggestion
   从 SQL 语法角度来看，所有分桶元素（即桶数量、分桶算法、分桶键列）均为可选配置。
   ```
   
   or
   
   ```suggestion
   从 SQL 语法角度来看，所有分桶要素（即桶数量、分桶算法、分桶键列）均为可选配置。
   ```



##########
docs/content.zh/docs/dev/table/sql/create.md:
##########
@@ -414,33 +414,40 @@ Flink 假设声明了主键的列都是不包含 Null 值的，Connector 在处
 
 ### `DISTRIBUTED`
 
-Buckets enable load balancing in an external storage system by splitting data into disjoint subsets. These subsets group rows with potentially "infinite" keyspace into smaller and more manageable chunks that allow for efficient parallel processing.
+分桶通过将数据拆分为互不相交的子集，实现外部存储系统的负载均衡。
+这些子集将理论上具有 “无限 ”键空间的行划分为更小且更易于管理的块，从而实现高效的并行处理。
 
-Bucketing depends heavily on the semantics of the underlying connector. However, a user can influence the bucketing behavior by specifying the number of buckets, the bucketing algorithm, and (if the algorithm allows it) the columns which are used for target bucket calculation.
+分桶行为在很大程度上取决于底层连接器的具体实现。不过用户仍然可以通过以下方式影响分桶行为：
 
-All bucketing components (i.e. bucket number, distribution algorithm, bucket key columns) are
-optional from a SQL syntax perspective.
+ 1. 指定桶的数量。
+ 2. 选择分桶算法。
+ 3. 指定用于计算目标桶的列（如果分桶算法支持）。
 
-Given the following SQL statements:
+从 SQL 语法角度来看，所有分桶组件（即桶数量、分桶算法、分桶键列）均为可选配置。
+
+以下是不同分桶定义的 SQL 示例：
 
 ```sql
--- Example 1
+-- 示例 1
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY HASH(uid) INTO 4 BUCKETS;
 
--- Example 2
+-- 示例 2
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid) INTO 4 BUCKETS;
 
--- Example 3
+-- 示例 3
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid);
 
--- Example 4
+-- 示例 4
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED INTO 4 BUCKETS;
 ```
 
-Example 1 declares a hash function on a fixed number of 4 buckets (i.e. HASH(uid) % 4 = target
-bucket). Example 2 leaves the selection of an algorithm up to the connector. Additionally,
-Example 3 leaves the number of buckets up  to the connector.
-In contrast, Example 4 only defines the number of buckets.
+示例 1 明确声明了一个分桶，根据 uid 列的哈希值分配到 4 个桶（即 目标桶 = HASH(uid) % 4）。

Review Comment:
   ```suggestion
   示例 1 完整声明了一个分桶，根据 uid 列的哈希值分配到 4 个桶（即 目标桶 = HASH(uid) % 4）。
   ```
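
For reference, the `HASH(uid) % 4` rule that Example 1 describes can be sketched as below. This is only an illustration under stated assumptions: `target_bucket` is a hypothetical helper (not a Flink API), and CRC32 merely stands in for whatever hash function the connector actually uses.

```python
import zlib

NUM_BUCKETS = 4  # matches "INTO 4 BUCKETS" in Example 1


def target_bucket(uid: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Hypothetical helper: map a uid to a bucket index in [0, num_buckets)."""
    # CRC32 is only a stand-in; the real hash function is chosen by the connector.
    digest = zlib.crc32(str(uid).encode("utf-8"))
    return digest % num_buckets


# Every uid lands in exactly one bucket, so the buckets are disjoint subsets.
buckets = {b: [] for b in range(NUM_BUCKETS)}
for uid in range(20):
    buckets[target_bucket(uid)].append(uid)
```

The same uid always maps to the same bucket, which is what lets an external system process the buckets in parallel.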



##########
docs/content.zh/docs/dev/table/sql/create.md:
##########
@@ -414,33 +414,40 @@ Flink 假设声明了主键的列都是不包含 Null 值的，Connector 在处
 
 ### `DISTRIBUTED`
 
-Buckets enable load balancing in an external storage system by splitting data into disjoint subsets. These subsets group rows with potentially "infinite" keyspace into smaller and more manageable chunks that allow for efficient parallel processing.
+分桶通过将数据拆分为互不相交的子集，实现外部存储系统的负载均衡。
+这些子集将理论上具有 “无限 ”键空间的行划分为更小且更易于管理的块，从而实现高效的并行处理。
 
-Bucketing depends heavily on the semantics of the underlying connector. However, a user can influence the bucketing behavior by specifying the number of buckets, the bucketing algorithm, and (if the algorithm allows it) the columns which are used for target bucket calculation.
+分桶行为在很大程度上取决于底层连接器的具体实现。不过用户仍然可以通过以下方式影响分桶行为：
 
-All bucketing components (i.e. bucket number, distribution algorithm, bucket key columns) are
-optional from a SQL syntax perspective.
+ 1. 指定桶的数量。
+ 2. 选择分桶算法。
+ 3. 指定用于计算目标桶的列（如果分桶算法支持）。
 
-Given the following SQL statements:
+从 SQL 语法角度来看，所有分桶组件（即桶数量、分桶算法、分桶键列）均为可选配置。
+
+以下是不同分桶定义的 SQL 示例：
 
 ```sql
--- Example 1
+-- 示例 1
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY HASH(uid) INTO 4 BUCKETS;
 
--- Example 2
+-- 示例 2
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid) INTO 4 BUCKETS;
 
--- Example 3
+-- 示例 3
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid);
 
--- Example 4
+-- 示例 4
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED INTO 4 BUCKETS;
 ```
 
-Example 1 declares a hash function on a fixed number of 4 buckets (i.e. HASH(uid) % 4 = target
-bucket). Example 2 leaves the selection of an algorithm up to the connector. Additionally,
-Example 3 leaves the number of buckets up  to the connector.
-In contrast, Example 4 only defines the number of buckets.
+示例 1 明确声明了一个分桶，根据 uid 列的哈希值分配到 4 个桶（即 目标桶 = HASH(uid) % 4）。
+
+示例 2 在示例 1 的基础上将分桶算法交由连接器决定。
+
+示例 3 更进一步，将桶的数量也交由连接器决定。

Review Comment:
   示例 3 则仅声明了分桶列，剩余要素即分桶算法和桶的数量由连接器决定。



##########
docs/content.zh/docs/dev/table/sql/create.md:
##########
@@ -414,33 +414,40 @@ Flink 假设声明了主键的列都是不包含 Null 值的，Connector 在处
 
 ### `DISTRIBUTED`
 
-Buckets enable load balancing in an external storage system by splitting data into disjoint subsets. These subsets group rows with potentially "infinite" keyspace into smaller and more manageable chunks that allow for efficient parallel processing.
+分桶通过将数据拆分为互不相交的子集，实现外部存储系统的负载均衡。
+这些子集将理论上具有 “无限 ”键空间的行划分为更小且更易于管理的块，从而实现高效的并行处理。
 
-Bucketing depends heavily on the semantics of the underlying connector. However, a user can influence the bucketing behavior by specifying the number of buckets, the bucketing algorithm, and (if the algorithm allows it) the columns which are used for target bucket calculation.
+分桶行为在很大程度上取决于底层连接器的具体实现。不过用户仍然可以通过以下方式影响分桶行为：
 
-All bucketing components (i.e. bucket number, distribution algorithm, bucket key columns) are
-optional from a SQL syntax perspective.
+ 1. 指定桶的数量。
+ 2. 选择分桶算法。
+ 3. 指定用于计算目标桶的列（如果分桶算法支持）。
 
-Given the following SQL statements:
+从 SQL 语法角度来看，所有分桶组件（即桶数量、分桶算法、分桶键列）均为可选配置。
+
+以下是不同分桶定义的 SQL 示例：
 
 ```sql
--- Example 1
+-- 示例 1
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY HASH(uid) INTO 4 BUCKETS;
 
--- Example 2
+-- 示例 2
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid) INTO 4 BUCKETS;
 
--- Example 3
+-- 示例 3
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED BY (uid);
 
--- Example 4
+-- 示例 4
 CREATE TABLE MyTable (uid BIGINT, name STRING) DISTRIBUTED INTO 4 BUCKETS;
 ```
 
-Example 1 declares a hash function on a fixed number of 4 buckets (i.e. HASH(uid) % 4 = target
-bucket). Example 2 leaves the selection of an algorithm up to the connector. Additionally,
-Example 3 leaves the number of buckets up  to the connector.
-In contrast, Example 4 only defines the number of buckets.
+示例 1 明确声明了一个分桶，根据 uid 列的哈希值分配到 4 个桶（即 目标桶 = HASH(uid) % 4）。
+
+示例 2 在示例 1 的基础上将分桶算法交由连接器决定。
+
+示例 3 更进一步，将桶的数量也交由连接器决定。
+
+示例 4 仅限定了桶的数量，其余规则由依赖连接器决定。

Review Comment:
   ```suggestion
   示例 4 仅限定了桶的数量，其余要素依赖连接器决定。
   ```



##########
docs/content.zh/docs/dev/table/sql/create.md:
##########
@@ -414,33 +414,40 @@ Flink 假设声明了主键的列都是不包含 Null 值的，Connector 在处
 
 ### `DISTRIBUTED`
 
-Buckets enable load balancing in an external storage system by splitting data into disjoint subsets. These subsets group rows with potentially "infinite" keyspace into smaller and more manageable chunks that allow for efficient parallel processing.
+分桶通过将数据拆分为互不相交的子集，实现外部存储系统的负载均衡。
+这些子集将理论上具有 “无限 ”键空间的行划分为更小且更易于管理的块，从而实现高效的并行处理。
 
-Bucketing depends heavily on the semantics of the underlying connector. However, a user can influence the bucketing behavior by specifying the number of buckets, the bucketing algorithm, and (if the algorithm allows it) the columns which are used for target bucket calculation.
+分桶行为在很大程度上取决于底层连接器的具体实现。不过用户仍然可以通过以下方式影响分桶行为：
 
-All bucketing components (i.e. bucket number, distribution algorithm, bucket key columns) are
-optional from a SQL syntax perspective.
+ 1. 指定桶的数量。
+ 2. 选择分桶算法。
+ 3. 指定用于计算目标桶的列（如果分桶算法支持）。
 
-Given the following SQL statements:
+从 SQL 语法角度来看，所有分桶组件（即桶数量、分桶算法、分桶键列）均为可选配置。
+
+以下是不同分桶定义的 SQL 示例：

Review Comment:
   给定以下 SQL 语句：



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
