Hi Alexei,

What you pasted is expected behavior. The JobManager and the two TaskManagers should each run in their own Docker container.
13276 should be the JobManager process, and it is the same process as 789. The two PIDs differ because they are shown in different PID namespaces (namespaces are a Linux kernel feature that Docker relies on, together with cgroups): 13276 is the PID seen from the host, and 789 is the PID seen from inside the container.

On Thu, Jul 19, 2018 at 10:00 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Alexei,
>
> I actually never used Mesos with container images. I always used it in a
> way where the Mesos task directly starts the Java process.
>
> Cheers,
> Till
>
> On Thu, Jul 19, 2018 at 2:44 PM NEKRASSOV, ALEXEI <an4...@att.com> wrote:
>
>> Till,
>>
>> Any insight into how Flink components are containerized in Mesos?
>>
>> Thanks!
>>
>> Alex
>>
>> *From:* Fabian Hueske [mailto:fhue...@gmail.com]
>> *Sent:* Monday, July 16, 2018 7:57 AM
>> *To:* NEKRASSOV, ALEXEI <an4...@att.com>
>> *Cc:* user@flink.apache.org; Till Rohrmann <trohrm...@apache.org>
>> *Subject:* Re: Flink on Mesos: containers question
>>
>> Hi Alexei,
>>
>> Till (in CC) is familiar with Flink's Mesos support in 1.4.x.
>>
>> Best, Fabian
>>
>> 2018-07-13 15:07 GMT+02:00 NEKRASSOV, ALEXEI <an4...@att.com>:
>>
>> Can someone please clarify how Flink on Mesos is containerized?
>>
>> On a 5-node Mesos cluster I started Flink (1.4.2) with two Task Managers.
>> Mesos shows a “flink” task and two “taskmanager” tasks, all on the same VM.
>>
>> On that VM I see one Docker container running a process that seems to be
>> the Mesos App Master:
>>
>> $ docker ps -a
>> CONTAINER ID   IMAGE                             COMMAND                  CREATED        STATUS        PORTS   NAMES
>> 97b6840466c0   mesosphere/dcos-flink:1.4.2-1.0   "/bin/sh -c /sbin/..."   41 hours ago   Up 41 hours           mesos-a0079d85-9ccb-4c43-8d31-e6b1ad750197
>>
>> $ docker exec 97b6840466c0 /bin/ps -efww
>> UID    PID  PPID  C  STIME  TTY    TIME      CMD
>> root     1     0  0  Jul11  ?      00:00:00  /bin/sh -c /sbin/init.sh
>> root     7     1  0  Jul11  ?      00:00:02  runsvdir -P /etc/service
>> root     8     7  0  Jul11  ?      00:00:00  runsv flink
>> root   629     0  0  Jul12  pts/0  00:00:00  /bin/bash
>> root   789     8  1  Jul12  ?      00:09:16  /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -classpath
>>     /flink-1.4.2/lib/flink-python_2.11-1.4.2.jar:/flink-1.4.2/lib/flink-shaded-hadoop2-uber-1.4.2.jar:/flink-1.4.2/lib/log4j-1.2.17.jar:/flink-1.4.2/lib/slf4j-log4j12-1.7.7.jar:/flink-1.4.2/lib/flink-dist_2.11-1.4.2.jar::/etc/hadoop/conf/:
>>     -Dlog.file=/mnt/mesos/sandbox/flink--mesos-appmaster-alex-tfc87d-private-agents-3.novalocal.log
>>     -Dlog4j.configuration=file:/flink-1.4.2/conf/log4j.properties
>>     -Dlogback.configurationFile=file:/flink-1.4.2/conf/logback.xml
>>     org.apache.flink.mesos.runtime.clusterframework.MesosApplicationMasterRunner
>>     -Dblob.server.port=23170 -Djobmanager.heap.mb=256
>>     -Djobmanager.rpc.port=23169 -Djobmanager.web.port=23168
>>     -Dmesos.artifact-server.port=23171 -Dmesos.initial-tasks=2
>>     -Dmesos.resourcemanager.tasks.cpus=2 -Dmesos.resourcemanager.tasks.mem=2048
>>     -Dtaskmanager.heap.mb=512 -Dtaskmanager.memory.preallocate=true
>>     -Dtaskmanager.numberOfTaskSlots=1 -Dparallelism.default=1
>>     -Djobmanager.rpc.address=localhost -Dmesos.resourcemanager.framework.role=*
>>     -Dsecurity.kerberos.login.use-ticket-cache=true
>> root  1027     0  0  12:54  ?      00:00:00  /bin/ps -efww
>>
>> Then on the VM itself I see another process with the same command line as
>> the one in the container:
>>
>> root  13276  9689  1  Jul12  ?  00:09:18  /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -classpath
>>     /flink-1.4.2/lib/flink-python_2.11-1.4.2.jar:/flink-1.4.2/lib/flink-shaded-hadoop2-uber-1.4.2.jar:/flink-1.4.2/lib/log4j-1.2.17.jar:/flink-1.4.2/lib/slf4j-log4j12-1.7.7.jar:/flink-1.4.2/lib/flink-dist_2.11-1.4.2.jar::/etc/hadoop/conf/:
>>     -Dlog.file=/mnt/mesos/sandbox/flink--mesos-appmaster-alex-tfc87d-private-agents-3.novalocal.log
>>     -Dlog4j.configuration=file:/flink-1.4.2/conf/log4j.properties
>>     -Dlogback.configurationFile=file:/flink-1.4.2/conf/logback.xml
>>     org.apache.flink.mesos.runtime.clusterframework.MesosApplicationMasterRunner
>>     -Dblob.server.port=23170 -Djobmanager.heap.mb=256
>>     -Djobmanager.rpc.port=23169 -Djobmanager.web.port=23168
>>     -Dmesos.artifact-server.port=23171 -Dmesos.initial-tasks=2
>>     -Dmesos.resourcemanager.tasks.cpus=2 -Dmesos.resourcemanager.tasks.mem=2048
>>     -Dtaskmanager.heap.mb=512 -Dtaskmanager.memory.preallocate=true
>>     -Dtaskmanager.numberOfTaskSlots=1 -Dparallelism.default=1
>>     -Djobmanager.rpc.address=localhost -Dmesos.resourcemanager.framework.role=*
>>     -Dsecurity.kerberos.login.use-ticket-cache=true
>>
>> And I see two processes on the VM that seem to be related to Task
>> Managers:
>>
>> root  13688  13687  0  Jul12  ?  00:04:25  /docker-java-home/jre/bin/java -Xms1448m -Xmx1448m -classpath
>>     /mnt/mesos/sandbox/flink/lib/flink-python_2.11-1.4.2.jar:/mnt/mesos/sandbox/flink/lib/flink-shaded-hadoop2-uber-1.4.2.jar:/mnt/mesos/sandbox/flink/lib/log4j-1.2.17.jar:/mnt/mesos/sandbox/flink/lib/slf4j-log4j12-1.7.7.jar:/mnt/mesos/sandbox/flink/lib/flink-dist_2.11-1.4.2.jar:::
>>     -Dlog.file=flink-taskmanager.log
>>     -Dlog4j.configuration=file:/mnt/mesos/sandbox/flink/conf/log4j.properties
>>     -Dlogback.configurationFile=file:/mnt/mesos/sandbox/flink/conf/logback.xml
>>     org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager
>>     -Dblob.server.port=23170 -Dmesos.artifact-server.port=23171
>>     -Djobmanager.heap.mb=256 -Djobmanager.rpc.address=localhost
>>     -Djobmanager.web.port=23168 -Dsecurity.kerberos.login.use-ticket-cache=true
>>     -Djobmanager.rpc.port=23169 -Dtaskmanager.memory.preallocate=true
>>     -Dtaskmanager.rpc.port=1027 -Dmesos.initial-tasks=2
>>     -Dmesos.resourcemanager.tasks.cpus=2
>>     -Dtaskmanager.maxRegistrationDuration=5 minutes
>>     -Dtaskmanager.data.port=1028 -Dparallelism.default=1
>>     -Dtaskmanager.numberOfTaskSlots=1 -Dmesos.resourcemanager.tasks.mem=2048
>>     -Dtaskmanager.heap.mb=512 -Dmesos.resourcemanager.framework.role=*
>>
>> root  13892  13891  0  Jul12  ?  00:04:15  /docker-java-home/jre/bin/java -Xms1448m -Xmx1448m -classpath
>>     /mnt/mesos/sandbox/flink/lib/flink-python_2.11-1.4.2.jar:/mnt/mesos/sandbox/flink/lib/flink-shaded-hadoop2-uber-1.4.2.jar:/mnt/mesos/sandbox/flink/lib/log4j-1.2.17.jar:/mnt/mesos/sandbox/flink/lib/slf4j-log4j12-1.7.7.jar:/mnt/mesos/sandbox/flink/lib/flink-dist_2.11-1.4.2.jar:::
>>     -Dlog.file=flink-taskmanager.log
>>     -Dlog4j.configuration=file:/mnt/mesos/sandbox/flink/conf/log4j.properties
>>     -Dlogback.configurationFile=file:/mnt/mesos/sandbox/flink/conf/logback.xml
>>     org.apache.flink.mesos.runtime.clusterframework.MesosTaskManager
>>     -Dblob.server.port=23170 -Dmesos.artifact-server.port=23171
>>     -Djobmanager.heap.mb=256 -Djobmanager.rpc.address=localhost
>>     -Djobmanager.web.port=23168 -Dsecurity.kerberos.login.use-ticket-cache=true
>>     -Djobmanager.rpc.port=23169 -Dtaskmanager.memory.preallocate=true
>>     -Dtaskmanager.rpc.port=1025 -Dmesos.initial-tasks=2
>>     -Dmesos.resourcemanager.tasks.cpus=2
>>     -Dtaskmanager.maxRegistrationDuration=5 minutes
>>     -Dtaskmanager.data.port=1026 -Dparallelism.default=1
>>     -Dtaskmanager.numberOfTaskSlots=1 -Dmesos.resourcemanager.tasks.mem=2048
>>     -Dtaskmanager.heap.mb=512 -Dmesos.resourcemanager.framework.role=*
>>
>> But I don’t see any containers for Task Managers.
>>
>> I thought maybe Task Managers run directly on the VM (PIDs 13688,
>> 13892), but my code executed in Task Managers has no access to the VM’s
>> filesystem.
>>
>> It is almost like there are more containers running than “docker ps” is
>> showing me. Can someone clarify?
>>
>> Also, what is the relationship between PID 13276 and the process that I
>> see in the container (the two processes with the same command line)?
>>
>> Thanks!
>>
>> Alex

--
Liu, Renjie
Software Engineer, MVAD
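
One possible way to check the PID mapping described in the reply at the top of this thread: on Linux 4.1 and later, /proc/<pid>/status has an NSpid line that lists a process's PID in each nested PID namespace, starting with the namespace of whoever reads the file. Below is a minimal Python sketch, not something taken from the thread. The helper names parse_nspid and nspid_chain are made up for illustration, and SAMPLE_STATUS is an invented input showing what the line might look like for the JobManager process, not output captured from this cluster.

```python
import sys
from pathlib import Path


def parse_nspid(status_text):
    """Return the PIDs on the NSpid line of a /proc/<pid>/status dump.

    The first entry is the PID as seen from the reader's PID namespace;
    later entries are the PIDs in successively nested namespaces.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # no NSpid line, e.g. on kernels older than 4.1


def nspid_chain(pid):
    """Read /proc/<pid>/status on a Linux host and parse its NSpid line."""
    return parse_nspid(Path(f"/proc/{pid}/status").read_text())


# Illustrative input only (assumption): NOT captured from the cluster above.
SAMPLE_STATUS = "Name:\tjava\nPid:\t13276\nNSpid:\t13276\t789\n"

if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Run on the host, e.g.: python nspid.py 13276
        print(nspid_chain(int(sys.argv[1])))
    else:
        print(parse_nspid(SAMPLE_STATUS))
```

If the explanation in the reply holds, running this on the VM against PID 13276 should list two PIDs, with the second one matching the PID seen inside the container. Running it against the Task Manager PIDs (13688, 13892) would likewise show whether they live in a nested PID namespace even though “docker ps” does not list a container for them.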