[ https://issues.apache.org/jira/browse/FLINK-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15305118#comment-15305118 ]
ASF GitHub Bot commented on FLINK-3667:
---------------------------------------

Github user EronWright commented on the pull request:

https://github.com/apache/flink/pull/1978#issuecomment-222285524

This PR dovetails nicely with the Mesos work and I'll be sure to build on it. Here are a few suggestions to align it even further.

The problem of _managing_ a Flink cluster is mostly independent from _using_ a cluster to submit and manage jobs. I would like to see the two concerns cleanly separated. In this PR, the `ClusterDescriptor` handles creating the cluster, then produces a `Client` with which to manage jobs and to handle shutdown. I suggest that a new component, the `YarnDispatcher`, be introduced to handle all lifecycle operations for a cluster. Make the `ClusterDescriptor` an entity class that is given to the dispatcher.

A related issue is that it's only possible to use the `YarnClusterClient` to interact with a newly created YARN session, not a pre-existing one. When submitting a job to an existing YARN session, it seems the `StandaloneClusterClient` is used (by supplying a JM endpoint). Is that true? Eventually the CLI should provide a nice way to discover and use existing YARN sessions.

The `detached` flags could use clarification. In the `Client` context, the detached concept seems related to interactivity with the job (tailing the status messages, etc.). I don't think it should imply anything about the lifecycle of the cluster; leave that to the dispatcher. The `stopAfterJob` method should move to the dispatcher accordingly.

This relates to Mesos as follows: the `MesosDispatcher` component will run in the Mesos cluster and be accessed remotely by the CLI. The `ClusterDescriptor` will be passed to it via REST. Everything will fit nicely.
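To make the suggested split concrete, here is a minimal sketch, not Flink's actual API. Only `ClusterDescriptor`, `stopAfterJob` and the `detached` flag come from the discussion above; the `Dispatcher` interface, the `InMemoryDispatcher` stand-in, `onJobFinished`, the descriptor fields and the string IDs are all hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Plain entity describing the desired cluster; it carries no lifecycle logic. Fields are hypothetical. */
final class ClusterDescriptor {
    final String name;
    final int taskManagers;

    ClusterDescriptor(String name, int taskManagers) {
        this.name = name;
        this.taskManagers = taskManagers;
    }
}

/** Job-level operations only. Here 'detached' is about interactivity, not cluster lifetime. */
interface ClusterClient {
    String submitJob(String jobName, boolean detached);
}

/** Hypothetical lifecycle interface; a YARN or Mesos dispatcher would implement it. */
interface Dispatcher {
    String startCluster(ClusterDescriptor descriptor);
    ClusterClient connect(String clusterId); // same path for new and pre-existing clusters
    void stopAfterJob(String clusterId);
    void shutdown(String clusterId);
    boolean isRunning(String clusterId);
}

/** In-memory stand-in so the sketch runs without YARN or Mesos. */
final class InMemoryDispatcher implements Dispatcher {
    private final Map<String, ClusterDescriptor> running = new HashMap<>();
    private final Set<String> stopWhenDone = new HashSet<>();
    private int nextId = 0;

    @Override
    public String startCluster(ClusterDescriptor descriptor) {
        String id = "cluster-" + (nextId++);
        running.put(id, descriptor);
        return id;
    }

    @Override
    public ClusterClient connect(String clusterId) {
        if (!running.containsKey(clusterId)) {
            throw new IllegalArgumentException("unknown cluster: " + clusterId);
        }
        // The client only knows how to talk to the cluster; it cannot stop it.
        // A real client would use 'detached' to decide whether to tail job status.
        return (jobName, detached) -> clusterId + "/" + jobName;
    }

    @Override
    public void stopAfterJob(String clusterId) {
        stopWhenDone.add(clusterId);
    }

    /** Hypothetical callback invoked when a job on the cluster completes. */
    void onJobFinished(String clusterId) {
        if (stopWhenDone.remove(clusterId)) {
            shutdown(clusterId);
        }
    }

    @Override
    public void shutdown(String clusterId) {
        running.remove(clusterId);
    }

    @Override
    public boolean isRunning(String clusterId) {
        return running.containsKey(clusterId);
    }
}

class DispatcherSketch {
    public static void main(String[] args) {
        InMemoryDispatcher dispatcher = new InMemoryDispatcher();
        String clusterId = dispatcher.startCluster(new ClusterDescriptor("session", 2));
        ClusterClient client = dispatcher.connect(clusterId);
        System.out.println("submitted " + client.submitJob("wordcount", true));
        dispatcher.shutdown(clusterId);
    }
}
```

In this shape, submitting a detached job leaves the cluster untouched, and the client for an existing session is obtained the same way as for a fresh one, by going through `connect` with a known cluster ID.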
:)

> Generalize client<->cluster communication
> -----------------------------------------
>
> Key: FLINK-3667
> URL: https://issues.apache.org/jira/browse/FLINK-3667
> Project: Flink
> Issue Type: Improvement
> Components: YARN Client
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
>
> Here are some notes I took when inspecting the client<->cluster classes with
> regard to future integration of other resource management frameworks in
> addition to Yarn (e.g. Mesos).
> {noformat}
> 1 Cluster Client Abstraction
> ════════════════════════════
> 1.1 Status Quo
> ──────────────
> 1.1.1 FlinkYarnClient
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Holds the cluster configuration (Flink-specific and Yarn-specific)
> • Contains the deploy() method to deploy the cluster
> • Creates the Hadoop Yarn client
> • Receives the initial job manager address
> • Bootstraps the FlinkYarnCluster
> 1.1.2 FlinkYarnCluster
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Wrapper around the Hadoop Yarn client
> • Queries cluster for status updates
> • Lifecycle methods to start and shut down the cluster
> • Flink-specific features like shutdown after job completion
> 1.1.3 ApplicationClient
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Acts as a middle-man for asynchronous cluster communication
> • Designed to communicate with Yarn, not used in Standalone mode
> 1.1.4 CliFrontend
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Deeply integrated with FlinkYarnClient and FlinkYarnCluster
> • Constantly distinguishes between Yarn and Standalone mode
> • Would be nice to have a general abstraction in place
> 1.1.5 Client
> ╌╌╌╌╌╌╌╌╌╌╌╌
> • Job submission and job-related actions, agnostic of resource framework
> 1.2 Proposal
> ────────────
> 1.2.1 ClusterConfig (before: AbstractFlinkYarnClient)
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Extensible cluster-agnostic config
> • May be extended by specific cluster, e.g. YarnClusterConfig
> 1.2.2 ClusterClient (before: AbstractFlinkYarnClient)
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Deals with cluster (RM) specific communication
> • Exposes framework-agnostic information
> • YarnClusterClient, MesosClusterClient, StandaloneClusterClient
> 1.2.3 FlinkCluster (before: AbstractFlinkYarnCluster)
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Basic interface to communicate with a running cluster
> • Receives the ClusterClient for cluster-specific communication
> • Should not have to care about the specific implementations of the
>   client
> 1.2.4 ApplicationClient
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • Can be changed to work cluster-agnostic (first steps already in
>   FLINK-3543)
> 1.2.5 CliFrontend
> ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
> • CliFrontend never has to differentiate between different
>   cluster types after it has determined which cluster class to load.
> • Base class handles framework-agnostic command line arguments
> • Pluggables for Yarn, Mesos handle specific commands
> {noformat}
> I would like to create/refactor the affected classes to set us up for a more
> flexible client side resource management abstraction.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)