HuangXingBo commented on a change in pull request #14088: URL: https://github.com/apache/flink/pull/14088#discussion_r533856595
########## File path: docs/concepts/flink-architecture.zh.md ##########

@@ -24,229 +24,109 @@ specific language governing permissions and limitations under the License.
 -->
 
-Flink is a distributed system and requires effective allocation and management of compute resources in order to execute streaming applications. It integrates with all common cluster resource managers such as [Hadoop YARN](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/YARN.html), [Apache Mesos](https://mesos.apache.org/) and [Kubernetes](https://kubernetes.io/), but can also be set up to run as a standalone cluster or even as a library.
+Flink 是一个分布式系统,需要有效分配和管理计算资源才能执行流应用程序。它集成了所有常见的集群资源管理器,例如[Hadoop YARN](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/YARN.html)、[Apache Mesos](https://mesos.apache.org/)和[Kubernetes](https://kubernetes.io/),但也可以设置作为独立集群甚至库运行。
 
-This section contains an overview of Flink’s architecture and describes how its main components interact to execute applications and recover from failures.
+本节概述了 Flink 架构,并且描述了其主要组件如何交互以执行应用程序和从故障中恢复。
 
 * This will be replaced by the TOC
 {:toc}
 
-## Anatomy of a Flink Cluster
+## Flink 集群剖析
 
-The Flink runtime consists of two types of processes: a _JobManager_ and one or more _TaskManagers_.
+Flink 运行时由两种类型的进程组成:_JobManager_和一个或者多个_TaskManager_。
 
 <img src="{% link /fig/processes.svg %}" alt="The processes involved in executing a Flink dataflow" class="offset" width="70%" />
 
-The *Client* is not part of the runtime and program execution, but is used to prepare and send a dataflow to the JobManager. After that, the client can disconnect (_detached mode_), or stay connected to receive progress reports (_attached mode_). The client runs either as part of the Java/Scala program that triggers the execution, or in the command line process `./bin/flink run ...`.
-
-The JobManager and TaskManagers can be started in various ways: directly on the machines as a [standalone cluster]({% link deployment/resource-providers/standalone/index.zh.md %}), in containers, or managed by resource frameworks like [YARN]({% link deployment/resource-providers/yarn.zh.md %}) or [Mesos]({% link deployment/resource-providers/mesos.zh.md %}). TaskManagers connect to JobManagers, announcing themselves as available, and are assigned work.
+*Client* 不是运行时和程序执行的一部分,而是用于准备数据流并将其发送给 JobManager。之后,客户端可以断开连接(_分离模式_),或保持连接来接收进程报告(_附加模式_)。客户端可以作为触发执行 Java/Scala 程序的一部分运行,也可以在命令行进程`./bin/flink run ...`中运行。

Review comment:
```suggestion
*Client* 不是运行时和程序执行的一部分,而是用于准备dataflow并将其发送给 JobManager。之后,客户端可以断开连接(_detached mode_),或保持连接来接收进度报告(_attached mode_)。客户端既可以运行Java/Scala的程序来触发执行,也可以通过命令行`./bin/flink run ...`的方式执行。
```

########## File path: docs/concepts/flink-architecture.zh.md ##########

+可以通过各种方式启动 JobManager 和 TaskManager:直接在机器上作为[standalone 集群]({% link deployment/resource-providers/standalone/index.zh.md %})启动、在容器中启动、或者通过[YARN]({% link deployment/resource-providers/yarn.zh.md %})或[Mesos]({% link deployment/resource-providers/mesos.zh.md %})等资源框架管理启动。TaskManager 连接到 JobManagers,宣布自己可用,并被分配工作。

Review comment:
```suggestion
可以通过多种方式启动 JobManager 和 TaskManager:直接在机器上作为[standalone 集群]({% link deployment/resource-providers/standalone/index.zh.md %})启动、在容器中启动、或者通过[YARN]({% link deployment/resource-providers/yarn.zh.md %})或[Mesos]({% link deployment/resource-providers/mesos.zh.md %})等资源管理框架启动。TaskManager 连接到 JobManagers,宣布自己可用,并被分配工作。
```

########## File path: docs/concepts/flink-architecture.zh.md ##########

 ### JobManager
 
-The _JobManager_ has a number of responsibilities related to coordinating the distributed execution of Flink Applications: it decides when to schedule the next task (or set of tasks), reacts to finished tasks or execution failures, coordinates checkpoints, and coordinates recovery on failures, among others. This process consists of three different components:
+_JobManager_具有许多与协调 Flink 应用程序的分布式执行有关的职责:它决定何时调度下一个 task(或一组 task)、对完成的 task 或执行失败做出反应、协调 checkpoint、并且协调从失败中恢复等等。这个进程由三个不同的组件组成:
 
 * **ResourceManager**
 
-  The _ResourceManager_ is responsible for resource de-/allocation and provisioning in a Flink cluster — it manages **task slots**, which are the unit of resource scheduling in a Flink cluster (see [TaskManagers](#taskmanagers)). Flink implements multiple ResourceManagers for different environments and resource providers such as YARN, Mesos, Kubernetes and standalone deployments. In a standalone setup, the ResourceManager can only distribute the slots of available TaskManagers and cannot start new TaskManagers on its own.
+  _ResourceManager_负责 Flink 集群中的资源删除/分配和供应 - 它管理 **task slots**,这是 Flink 集群中资源调度的单位(请参考[TaskManagers](#taskmanagers))。Flink 为不同的环境和资源提供者(例如 YARN、Mesos、Kubernetes 和 standalone 部署)实现了多个 ResourceManager。在 standalone 设置中,ResourceManager 只能分配可用 TaskManager 的 slots,而不能自行启动新的 TaskManager。
 
 * **Dispatcher**
 
-  The _Dispatcher_ provides a REST interface to submit Flink applications for execution and starts a new JobMaster for each submitted job. It also runs the Flink WebUI to provide information about job executions.
+  _Dispatcher_ 提供了一个 REST 接口,用来提交 Flink 应用程序执行,并为每个提交的作业启动一个新的 JobMaster。它还运行 Flink WebUI 用来提供作业执行信息。
 
 * **JobMaster**
 
-  A _JobMaster_ is responsible for managing the execution of a single [JobGraph]({% link concepts/glossary.zh.md %}#logical-graph). Multiple jobs can run simultaneously in a Flink cluster, each having its own JobMaster.
+  _JobMaster_ 负责管理单个[JobGraph]({% link concepts/glossary.zh.md %}#logical-graph)的执行。Flink 集群中可以同时运行多个作业,每个作业都有自己的 JobMaster。
 
-There is always at least one JobManager. A high-availability setup might have multiple JobManagers, one of which is always the *leader*, and the others are *standby* (see [High Availability (HA)]({% link deployment/ha/index.zh.md %})).
+始终至少有一个 JobManager。高可用设置中可能有多个 JobManager,其中一个始终是 *leader*,其他的则是 *standby*(请参考 [高可用(HA)]({% link deployment/ha/index.zh.md %}))。
 
 ### TaskManagers
 
-The *TaskManagers* (also called *workers*) execute the tasks of a dataflow, and buffer and exchange the data streams.
+*TaskManager*(也称为 *worker*)执行数据流的 task,并且缓存和交换数据流。
 
-There must always be at least one TaskManager. The smallest unit of resource scheduling in a TaskManager is a task _slot_. The number of task slots in a TaskManager indicates the number of concurrent processing tasks. Note that multiple operators may execute in a task slot (see [Tasks and Operator Chains](#tasks-and-operator-chains)).
+必须始终至少有一个 TaskManager。在 TaskManager 中资源调度的最小单位是 task _slot_。TaskManager 中 task slot 的数量表示并发处理 task 的数量。请注意一个 task slot 中可以执行多个算子(请参考[Tasks 和算子链](#tasks-and-operator-chains))。
 
 {% top %}
 
-## Tasks and Operator Chains
+## Tasks 和算子链

Review comment:
```suggestion
## Tasks 和 Operator Chains
```

########## File path: docs/concepts/flink-architecture.zh.md ##########

-For distributed execution, Flink *chains* operator subtasks together into *tasks*. Each task is executed by one thread. Chaining operators together into tasks is a useful optimization: it reduces the overhead of thread-to-thread handover and buffering, and increases overall throughput while decreasing latency. The chaining behavior can be configured; see the [chaining docs]({% link dev/stream/operators/index.zh.md %}#task-chaining-and-resource-groups) for details.
+对于分布式执行,Flink 将算子的 subtasks *链接*成 *tasks*。每个 task 由一个线程执行。将算子链接成 task 是个有用的优化:它减少线程间切换、缓冲的消耗,并且减少延迟的同时增加整体吞吐量。链行为是可以配置的;请参考[链文档]({% link dev/stream/operators/index.zh.md %}#task-chaining-and-resource-groups)以获取详细信息。

Review comment:
```suggestion
对于分布式执行,Flink 将算子的 subtasks *链接*成 *tasks*。每个 task 由一个线程执行。将算子链接成 task 是个有用的优化:它减少线程间切换和缓冲的开销,并且减少延迟的同时增加整体吞吐量。链行为是可以配置的;请参考[链文档]({% link dev/stream/operators/index.zh.md %}#task-chaining-and-resource-groups)以获取详细信息。
```

########## File path: docs/concepts/flink-architecture.zh.md ##########
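The chaining optimization discussed in this hunk can be pictured as function fusion: instead of handing each record from one task's thread to the next through a buffer, the chained operators are called back-to-back in a single thread. A minimal sketch in plain Python (an analogy only, not Flink's API — `parse`, `double`, and `chain` are made-up names for illustration):

```python
# Sketch: why chaining operators into one task removes per-record handover.
# Unchained, each operator would run in its own task and records would
# cross a thread-to-thread buffer; chained, the functions are fused and
# invoked back-to-back in one thread.

def parse(record):          # operator 1: deserialize the record
    return int(record)

def double(value):          # operator 2: transform the value
    return value * 2

def chain(*operators):
    """Fuse operators into a single callable, like a chained task."""
    def fused(record):
        for op in operators:
            record = op(record)
        return record
    return fused

fused_task = chain(parse, double)
results = [fused_task(r) for r in ["1", "2", "3"]]
print(results)  # [2, 4, 6]
```

In Flink itself this fusion is what the chaining docs referenced above configure; the sketch only shows the shape of the optimization.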
+Flink 运行时由两种类型的进程组成:_JobManager_和一个或者多个_TaskManager_。

Review comment:
```suggestion
Flink 运行时由两种类型的进程组成:一个_JobManager_和一个或者多个_TaskManager_。
```

########## File path: docs/concepts/flink-architecture.zh.md ##########

+  _ResourceManager_负责 Flink 集群中的资源删除/分配和供应 - 它管理 **task slots**,这是 Flink 集群中资源调度的单位(请参考[TaskManagers](#taskmanagers))。Flink 为不同的环境和资源提供者(例如 YARN、Mesos、Kubernetes 和 standalone 部署)实现了多个 ResourceManager。在 standalone 设置中,ResourceManager 只能分配可用 TaskManager 的 slots,而不能自行启动新的 TaskManager。

Review comment:
```suggestion
  _ResourceManager_负责 Flink 集群资源的回收/分配和调配 - 它管理 **task slots**,这是 Flink 集群中资源调度的单位(请参考[TaskManagers](#taskmanagers))。Flink 为不同的环境和资源提供者(例如 YARN、Mesos、Kubernetes 和 standalone 部署)实现了对应的 ResourceManager。在 standalone 设置中,ResourceManager 只能分配可用 TaskManager 的 slots,而不能自行启动新的 TaskManager。
```

########## File path: docs/concepts/flink-architecture.zh.md ##########

+必须始终至少有一个 TaskManager。在 TaskManager 中资源调度的最小单位是 task _slot_。TaskManager 中 task slot 的数量表示并发处理 task 的数量。请注意一个 task slot 中可以执行多个算子(请参考[Tasks 和算子链](#tasks-and-operator-chains))。

Review comment:
```suggestion
必须始终至少有一个 TaskManager。在 TaskManager 中资源调度的最小单位是 task _slot_。TaskManager 中 task slot 的数量表示并发处理 task 的数量。请注意一个 task slot 中可以执行多个算子(请参考[Tasks 和 Operator Chains](#tasks-and-operator-chains))。
```

########## File path: docs/concepts/flink-architecture.zh.md ##########
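The task-slot paragraph commented on above says the slot count bounds how many tasks a TaskManager processes concurrently. A rough plain-Python analogy models each slot as a worker in a fixed-size pool (an illustration only — real Flink slots partition a TaskManager's memory/resources, they are not literally a thread pool, and `run_tasks` is a made-up helper):

```python
# Sketch: task slots as a bound on concurrent task execution.
# A pool of size `slots` never has more than `slots` tasks in flight.
import threading
from concurrent.futures import ThreadPoolExecutor

def run_tasks(num_tasks, slots):
    active = 0   # tasks currently executing
    peak = 0     # highest concurrency observed
    lock = threading.Lock()

    def task(i):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        # ... the subtask's actual processing would happen here ...
        with lock:
            active -= 1
        return i

    with ThreadPoolExecutor(max_workers=slots) as pool:
        done = list(pool.map(task, range(num_tasks)))
    return done, peak

done, peak = run_tasks(num_tasks=8, slots=2)
assert len(done) == 8 and peak <= 2  # never more tasks in flight than slots
```

This is why the docs can say "task slot 的数量表示并发处理 task 的数量": the slots cap in-flight work, while queued tasks wait for a free slot.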
+Flink 是一个分布式系统,需要有效分配和管理计算资源才能执行流应用程序。它集成了所有常见的集群资源管理器,例如[Hadoop YARN](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/YARN.html)、[Apache Mesos](https://mesos.apache.org/)和[Kubernetes](https://kubernetes.io/),但也可以设置作为独立集群甚至库运行。 -This section contains an overview of Flink’s architecture and describes how its -main components interact to execute applications and recover from failures. +本节概述了 Flink 架构,并且描述了其主要组件如何交互以执行应用程序和从故障中恢复。 * This will be replaced by the TOC {:toc} -## Anatomy of a Flink Cluster +## Flink 集群剖析 -The Flink runtime consists of two types of processes: a _JobManager_ and one or more _TaskManagers_. +Flink 运行时由两种类型的进程组成:_JobManager_和一个或者多个_TaskManager_。 <img src="{% link /fig/processes.svg %}" alt="The processes involved in executing a Flink dataflow" class="offset" width="70%" /> -The *Client* is not part of the runtime and program execution, but is used to -prepare and send a dataflow to the JobManager. After that, the client can -disconnect (_detached mode_), or stay connected to receive progress reports -(_attached mode_). The client runs either as part of the Java/Scala program -that triggers the execution, or in the command line process `./bin/flink run -...`. - -The JobManager and TaskManagers can be started in various ways: directly on -the machines as a [standalone cluster]({% link -deployment/resource-providers/standalone/index.zh.md %}), in containers, or managed by resource -frameworks like [YARN]({% link deployment/resource-providers/yarn.zh.md -%}) or [Mesos]({% link deployment/resource-providers/mesos.zh.md %}). -TaskManagers connect to JobManagers, announcing themselves as available, and -are assigned work. 
+*Client* 不是运行时和程序执行的一部分,而是用于准备数据流并将其发送给 JobManager。之后,客户端可以断开连接(_分离模式_),或保持连接来接收进程报告(_附加模式_)。客户端可以作为触发执行 Java/Scala 程序的一部分运行,也可以在命令行进程`./bin/flink run ...`中运行。
+
+可以通过各种方式启动 JobManager 和 TaskManager:直接在机器上作为[standalone 集群]({% link deployment/resource-providers/standalone/index.zh.md %})启动、在容器中启动、或者通过[YARN]({% link deployment/resource-providers/yarn.zh.md %})或[Mesos]({% link deployment/resource-providers/mesos.zh.md %})等资源框架管理启动。TaskManager 连接到 JobManagers,宣布自己可用,并被分配工作。

 ### JobManager

-The _JobManager_ has a number of responsibilities related to coordinating the distributed execution of Flink Applications:
-it decides when to schedule the next task (or set of tasks), reacts to finished
-tasks or execution failures, coordinates checkpoints, and coordinates recovery on
-failures, among others. This process consists of three different components:
+_JobManager_具有许多与协调 Flink 应用程序的分布式执行有关的职责:它决定何时调度下一个 task(或一组 task)、对完成的 task 或执行失败做出反应、协调 checkpoint、并且协调从失败中恢复等等。这个进程由三个不同的组件组成:

  * **ResourceManager**

-   The _ResourceManager_ is responsible for resource de-/allocation and
-   provisioning in a Flink cluster — it manages **task slots**, which are the
-   unit of resource scheduling in a Flink cluster (see [TaskManagers](#taskmanagers)).
-   Flink implements multiple ResourceManagers for different environments and
-   resource providers such as YARN, Mesos, Kubernetes and standalone
-   deployments. In a standalone setup, the ResourceManager can only distribute
-   the slots of available TaskManagers and cannot start new TaskManagers on
-   its own.
+   _ResourceManager_负责 Flink 集群中的资源删除/分配和供应 - 它管理 **task slots**,这是 Flink 集群中资源调度的单位(请参考[TaskManagers](#taskmanagers))。Flink 为不同的环境和资源提供者(例如 YARN、Mesos、Kubernetes 和 standalone 部署)实现了多个 ResourceManager。在 standalone 设置中,ResourceManager 只能分配可用 TaskManager 的 slots,而不能自行启动新的 TaskManager。

  * **Dispatcher**

-   The _Dispatcher_ provides a REST interface to submit Flink applications for
-   execution and starts a new JobMaster for each submitted job.
-   It also runs the Flink WebUI to provide information about job executions.
+   _Dispatcher_ 提供了一个 REST 接口,用来提交 Flink 应用程序执行,并为每个提交的作业启动一个新的 JobMaster。它还运行 Flink WebUI 用来提供作业执行信息。

  * **JobMaster**

-   A _JobMaster_ is responsible for managing the execution of a single
-   [JobGraph]({% link concepts/glossary.zh.md %}#logical-graph).
-   Multiple jobs can run simultaneously in a Flink cluster, each having its
-   own JobMaster.
+   _JobMaster_ 负责管理单个[JobGraph]({% link concepts/glossary.zh.md %}#logical-graph)的执行。Flink 集群中可以同时运行多个作业,每个作业都有自己的 JobMaster。

-There is always at least one JobManager. A high-availability setup might have
-multiple JobManagers, one of which is always the *leader*, and the others are
-*standby* (see [High Availability (HA)]({% link deployment/ha/index.zh.md %})).
+始终至少有一个 JobManager。高可用设置中可能有多个 JobManager,其中一个始终是 *leader*,其他的则是 *standby*(请参考 [高可用(HA)]({% link deployment/ha/index.zh.md %}))。

Review comment:
```suggestion
始终至少有一个 JobManager。高可用(HA)设置中可能有多个 JobManager,其中一个始终是 *leader*,其他的则是 *standby*(请参考 [高可用(HA)]({% link deployment/ha/index.zh.md %}))。
```

########## File path: docs/concepts/flink-architecture.zh.md ##########
@@ -24,229 +24,109 @@ specific language governing permissions and limitations under the License.
 ### TaskManagers

-The *TaskManagers* (also called *workers*) execute the tasks of a dataflow, and buffer and exchange the data
-streams.
+*TaskManager*(也称为 *worker*)执行数据流的 task,并且缓存和交换数据流。

Review comment:
```suggestion
*TaskManagers*(也称为 *workers*)执行一个 dataflow 的 tasks,并且缓存和交换数据流。
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
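As context for the task-slot wording discussed in the review above: the number of task slots a TaskManager offers is a plain configuration value, not something derived at runtime. A minimal `flink-conf.yaml` fragment is sketched below; the value `4` is an illustrative example, not part of this PR.

```yaml
# flink-conf.yaml (fragment) — illustrative value only.
# Each TaskManager advertises this many task slots; each slot can run one
# pipeline of chained operator subtasks concurrently, so this bounds the
# number of tasks a single TaskManager processes in parallel.
taskmanager.numberOfTaskSlots: 4
```

Similarly, the detached/attached client modes mentioned in the first review comment map to the CLI: `./bin/flink run -d <jar>` submits and disconnects (detached mode), while omitting `-d` keeps the client attached to receive progress reports.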