Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3236#discussion_r99063747

    --- Diff: docs/setup/mesos.md ---
    @@ -145,60 +174,75 @@ If set to 'docker', specify the image name:
     In the `/bin` directory of the Flink distribution, you find two startup scripts which manage the Flink processes in a Mesos cluster:
     
    -1. mesos-appmaster.sh
    -   This starts the Mesos application master which will register the Mesos
    -   scheduler. It is also responsible for starting up the worker nodes.
    +1. `mesos-appmaster.sh`
    +   This starts the Mesos application master which will register the Mesos scheduler.
    +   It is also responsible for starting up the worker nodes.
     
    -2. mesos-taskmanager.sh
    -   The entry point for the Mesos worker processes. You don't need to explicitly
    -   execute this script. It is automatically launched by the Mesos worker node to
    -   bring up a new TaskManager.
    +2. `mesos-taskmanager.sh`
    +   The entry point for the Mesos worker processes.
    +   You don't need to explicitly execute this script.
    +   It is automatically launched by the Mesos worker node to bring up a new TaskManager.
    +
    +In order to run the `mesos-appmaster.sh` script you have to define `mesos.master` in the `flink-conf.yaml` or pass it via `-Dmesos.master=...` to the Java process.
    +Additionally, you should define the number of task managers which are started by Mesos via `mesos.initial-tasks`.
    +This value can also be defined in the `flink-conf.yaml` or passed as a Java property.
    +
    +When executing `mesos-appmaster.sh`, it will create a job manager on the machine where you executed the script.
    +In contrast to that, the task managers will be run as Mesos tasks in the Mesos cluster.
    +
    +#### General configuration
    +
    +It is possible to completely parameterize a Mesos application through Java properties passed to the Mesos application master.
    +This also allows to specify general Flink configuration parameters.
    +For example:
    +
    +    bin/mesos-appmaster.sh \
    +        -Dmesos.master=master.foobar.org:5050
    +        -Djobmanager.heap.mb=1024 \
    +        -Djobmanager.rpc.port=6123 \
    +        -Djobmanager.web.port=8081 \
    +        -Dmesos.initial-tasks=10 \
    +        -Dmesos.resourcemanager.tasks.mem=4096 \
    +        -Dtaskmanager.heap.mb=3500 \
    +        -Dtaskmanager.numberOfTaskSlots=2 \
    +        -Dparallelism.default=10
     
     ### High Availability
     
    -You will need to run a service like Marathon or Apache Aurora which takes care
    -of restarting the Flink master process in case of node or process failures. In
    -addition, Zookeeper needs to be configured like described in the
    -[High Availability section of the Flink docs]({{ site.baseurl }}/setup/jobmanager_high_availability.html)
    +You will need to run a service like Marathon or Apache Aurora which takes care of restarting the Flink master process in case of node or process failures.
    +In addition, Zookeeper needs to be configured like described in the [High Availability section of the Flink docs]({{ site.baseurl }}/setup/jobmanager_high_availability.html)
     
    -For the reconciliation of tasks to work correctly, please also set
    -`recovery.zookeeper.path.mesos-workers` to a valid Zookeeper path.
    +For the reconciliation of tasks to work correctly, please also set `recovery.zookeeper.path.mesos-workers` to a valid Zookeeper path.
     
     #### Marathon
     
    -Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script. In
    -particular, it should also adjust any configuration parameters for the Flink
    -cluster.
    +Marathon needs to be set up to launch the `bin/mesos-appmaster.sh` script.
    +In particular, it should also adjust any configuration parameters for the Flink cluster.
     
     Here is an example configuration for Marathon:
     
         {
    -        "id": "basic-0",
    -        "cmd": "$FLINK_HOME/bin/mesos-appmaster.sh -DconfigEntry=configValue -DanotherEntry=anotherValue ...",
    +        "id": "flink",
    +        "cmd": "$FLINK_HOME/bin/mesos-appmaster.sh -Djobmanager.heap.mb=1024 -Djobmanager.rpc.port=6123 -Djobmanager.web.port=8081 -Dmesos.initial-tasks=1 -Dmesos.resourcemanager.tasks.mem=1024 -Dtaskmanager.heap.mb=1024 -Dtaskmanager.numberOfTaskSlots=2 -Dparallelism.default=2 -Dmesos.resourcemanager.tasks.cpus=1",
             "cpus": 1.0,
    -        "mem": 2048,
    +        "mem": 1024
         }
     
    +When running Flink with Marathon, the whole Flink cluster including the job manager will be run as Mesos tasks in the Mesos cluster.
    +
     ### Configuration parameters
     
     #### Mesos configuration entries
     
    +`mesos.initial-tasks`: The initial workers to bring up when the master starts (**DEFAULT**: The number of workers specified at cluster startup).
     
    -`mesos.initial-tasks`: The initial workers to bring up when the master
    -  starts (**DEFAULT**: The number of workers specified at cluster startup).
    -
    -`mesos.maximum-failed-tasks`: The maximum number of failed workers before
    -  the cluster fails (**DEFAULT**: Number of initial workers) May be set to -1
    -  to disable this feature.
    +`mesos.maximum-failed-tasks`: The maximum number of failed workers before the cluster fails (**DEFAULT**: Number of initial workers).
    +May be set to -1 to disable this feature.
     
    -`mesos.master`: The Mesos master URL. The value should be in one of the
    -  following forms: host:port, zk://host1:port1,host2:port2,.../path,
    -  zk://username:password@host1:port1,host2:port2,.../path,
    -  file:///path/to/file (where file contains one of the above)
    +`mesos.master`: The Mesos master URL. The value should be in one of the following forms: host:port, zk://host1:port1,host2:port2,.../path, zk://username:password@host1:port1,host2:port2,.../path, file:///path/to/file (where file contains one of the above)
    --- End diff --

I find this part a bit hard to read on the compiled docs.
Perhaps use a bulleted list to show the different formats?
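
As a rough sketch of what that could look like, here is the `mesos.master` entry with the four formats from the diff pulled out into a list. The `*` bullet style and the backticks around each format are only assumptions about what fits the rest of the page; the wording is otherwise unchanged.

```markdown
`mesos.master`: The Mesos master URL. The value should be in one of the following forms:

* `host:port`
* `zk://host1:port1,host2:port2,.../path`
* `zk://username:password@host1:port1,host2:port2,.../path`
* `file:///path/to/file` (where file contains one of the above)
```

Putting each format in backticks should also keep the `zk://` and `file://` forms from being auto-linked or reflowed in the rendered page, though that would need checking against the compiled docs.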