[ https://issues.apache.org/jira/browse/FLINK-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958984#comment-14958984 ]
ASF GitHub Bot commented on FLINK-1984:
---------------------------------------

GitHub user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/948#issuecomment-148403603

I tried running the code from this pull request again, this time using the `mesos-playa` Vagrant image, and it does not work for me. I was following your instructions.
When did you last test the changes?

My motivation to test this pull request goes down every time I test it. I've spun up a Mesos cluster on GCE twice, plus the VM now.
Maybe I'm doing it wrong; please let me know what I can do to get it running.

CLI output:

```
vagrant@mesos:~/flink/build-target$ java -Dlog4j.configuration=file://`pwd`/conf/log4j.properties -Dlog.file=logs.log -cp lib/flink-dist-0.10-SNAPSHOT.jar org.apache.flink.mesos.scheduler.FlinkScheduler --confDir conf/
I1015 14:05:01.591161 9992 sched.cpp:157] Version: 0.22.1
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@716: Client environment:host.name=mesos
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@724: Client environment:os.arch=3.16.0-30-generic
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@725: Client environment:os.version=#40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@733: Client environment:user.name=vagrant
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@741: Client environment:user.home=/home/vagrant
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/vagrant/flink/flink-dist/target/flink-0.10-SNAPSHOT-bin/flink-0.10-SNAPSHOT
2015-10-15 14:05:01,592:9991(0x7f67cffff700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=127.0.0.1:2181 sessionTimeout=10000 watcher=0x7f67dac33a60 sessionId=0 sessionPasswd=<null> context=0x7f67f0004470 flags=0
2015-10-15 14:05:01,592:9991(0x7f67c6ffd700):ZOO_INFO@check_events@1703: initiated connection to server [127.0.0.1:2181]
Embedded server listening at http://127.0.0.1:40815
Press any key to stop.
2015-10-15 14:05:04,959:9991(0x7f67c6ffd700):ZOO_INFO@check_events@1750: session establishment complete on server [127.0.0.1:2181], sessionId=0x1506b6312fa000b, negotiated timeout=10000
I1015 14:05:04.959841 10024 group.cpp:313] Group process (group(1)@127.0.1.1:57437) connected to ZooKeeper
I1015 14:05:04.959899 10024 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I1015 14:05:04.959928 10024 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
I1015 14:05:05.204282 10024 detector.cpp:138] Detected a new leader: (id='2')
I1015 14:05:05.204489 10024 group.cpp:659] Trying to get '/mesos/info_0000000002' in ZooKeeper
I1015 14:05:05.303072 10024 detector.cpp:452] A new leading master (UPID=master@127.0.1.1:5050) is detected
I1015 14:05:05.303467 10024 sched.cpp:254] New master detected at master@127.0.1.1:5050
I1015 14:05:05.303890 10024 sched.cpp:264] No credentials provided. Attempting to register without authentication
I1015 14:05:05.851562 10024 sched.cpp:448] Framework registered with 20151015-120419-16842879-5050-1244-0000
```

Log file content:

```
14:04:54,564 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - --------------------------------------------------------------------------------
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Starting JobManager (Version: 0.10-SNAPSHOT, Rev:d905af0, Date:06.10.2015 @ 19:37:22 UTC)
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Current user: vagrant
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.7/24.79-b02
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Maximum heap size: 592 MiBytes
14:04:55,763 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - JAVA_HOME: (not set)
14:04:55,823 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Hadoop version: 2.3.0
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - JVM Options:
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - -Dlog4j.configuration=file:///home/vagrant/flink/build-target/conf/log4j.properties
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - -Dlog.file=logs.log
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Program Arguments:
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - --confDir
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - conf/
14:04:55,824 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - --------------------------------------------------------------------------------
14:04:55,875 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Maximum number of open file descriptors is 4096
14:04:55,875 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Loading configuration from /home/vagrant/flink/flink-dist/target/flink-0.10-SNAPSHOT-bin/flink-0.10-SNAPSHOT/conf
14:04:58,375 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager
14:04:58,377 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager actor system at localhost:6123.
14:04:59,700 INFO org.eclipse.jetty.util.log - jetty-0.10-SNAPSHOT
14:05:01,985 INFO org.eclipse.jetty.util.log - Started SocketConnector@127.0.0.1:40815
14:05:07,698 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
14:05:07,750 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Accepting
14:05:07,960 INFO Remoting - Starting remoting
14:05:09,241 INFO Remoting - Remoting started; listening on addresses :[akka.tcp://flink@127.0.0.1:6123]
14:05:09,248 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager actor
14:05:09,597 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-9b7614f7-7d0d-4c5e-b4c6-911f0ab845ef
14:05:09,597 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:40000 - max concurrent requests: 50 - max backlog: 1000
14:05:10,470 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.tcp://flink@127.0.0.1:6123/user/jobmanager.
14:05:10,471 INFO org.apache.flink.runtime.jobmanager.MemoryArchivist - Started memory archivist akka://flink/user/archive
14:05:10,563 INFO org.apache.flink.runtime.jobmanager.JobManager - JobManager akka.tcp://flink@127.0.0.1:6123/user/jobmanager was granted leadership with leader session ID None.
14:05:10,593 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManger web frontend
14:05:10,735 INFO org.apache.flink.runtime.jobmanager.web.WebInfoServer - Setting up web info server, using web-root directory jar:file:/home/vagrant/flink/flink-dist/target/flink-0.10-SNAPSHOT-bin/flink-0.10-SNAPSHOT/lib/flink-dist-0.10-SNAPSHOT.jar!/web-docs-infoserver.
14:05:11,162 INFO org.eclipse.jetty.util.log - jetty-0.10-SNAPSHOT
14:05:11,165 INFO org.eclipse.jetty.util.log - Started SelectChannelConnector@0.0.0.0:8081
14:05:11,166 INFO org.apache.flink.runtime.jobmanager.web.WebInfoServer - Started web info server for JobManager on 0.0.0.0:8081
14:05:14,936 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Declining offer(s) from slave 20151015-120419-16842879-5050-1244-S0 offered [cpus: 1.5 | mem : 488.0 | disk: 33044.0] required [cpus: 0.5 | mem: 512.0 | disk: 1024.0]
14:05:15,948 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - statusUpdate received from taskId: TaskManager_1 slaveId: 20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:15,948 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Lost taskManager with TaskId: TaskManager_1 on slave: 20151015-120419-16842879-5050-1244-S0
14:05:16,939 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Accepting
14:05:17,092 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - statusUpdate received from taskId: TaskManager_2 slaveId: 20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:17,092 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Lost taskManager with TaskId: TaskManager_2 on slave: 20151015-120419-16842879-5050-1244-S0
14:05:17,939 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Accepting
14:05:18,096 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - statusUpdate received from taskId: TaskManager_3 slaveId: 20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:18,096 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Lost taskManager with TaskId: TaskManager_3 on slave: 20151015-120419-16842879-5050-1244-S0
14:05:18,940 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Accepting
14:05:19,112 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - statusUpdate received from taskId: TaskManager_4 slaveId: 20151015-120419-16842879-5050-1244-S0 [TASK_LOST]
14:05:19,113 INFO org.apache.flink.mesos.scheduler.FlinkScheduler$ - Lost taskManager with TaskId: TaskManager_4 on slave: 20151015-120419-16842879-5050-1244-S0
.... this goes on forever? ...
```

Mesos file `mesos-slave.WARNING`:

```
Log file created at: 2015/10/15 12:04:40
Running on machine: mesos
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W1015 12:04:40.464870 1310 slave.cpp:1934] Ignoring updating pid for framework 20151007-005549-16842879-5050-1191-0001 because it does not exist
W1015 12:05:08.030145 1313 slave.cpp:1934] Ignoring updating pid for framework 20151007-005549-16842879-5050-1191-0000 because it does not exist
E1015 14:05:14.378486 1312 slave.cpp:3112] Container '74dc3694-16ec-470f-88c6-b06b7f295682' for executor 'executor_1' of framework '20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs for container '74dc3694-16ec-470f-88c6-b06b7f295682'with exit status: 256
E1015 14:05:15.768391 1315 slave.cpp:3461] Failed to unmonitor container for executor executor_1 of framework 20151015-120419-16842879-5050-1244-0000: Not monitored
W1015 14:05:15.851459 1312 containerizer.cpp:814] Ignoring update for unknown container: 74dc3694-16ec-470f-88c6-b06b7f295682
E1015 14:05:16.989680 1307 slave.cpp:3112] Container '2af2d3c0-e30c-4405-9ff1-7f4389bb62e9' for executor 'executor_2' of framework '20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs for container '2af2d3c0-e30c-4405-9ff1-7f4389bb62e9'with exit status: 256
E1015 14:05:17.090631 1312 slave.cpp:3461] Failed to unmonitor container for executor executor_2 of framework 20151015-120419-16842879-5050-1244-0000: Not monitored
W1015 14:05:17.091418 1305 containerizer.cpp:814] Ignoring update for unknown container: 2af2d3c0-e30c-4405-9ff1-7f4389bb62e9
E1015 14:05:17.993669 1310 slave.cpp:3112] Container '8cbc46f8-3200-4f9b-9134-099a0f6f3541' for executor 'executor_3' of framework '20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs for container '8cbc46f8-3200-4f9b-9134-099a0f6f3541'with exit status: 256
E1015 14:05:18.095177 1310 slave.cpp:3461] Failed to unmonitor container for executor executor_3 of framework 20151015-120419-16842879-5050-1244-0000: Not monitored
W1015 14:05:18.095211 1310 containerizer.cpp:814] Ignoring update for unknown container: 8cbc46f8-3200-4f9b-9134-099a0f6f3541
E1015 14:05:19.006584 1305 slave.cpp:3112] Container 'aca9e80a-5a34-4c29-a123-f025dc4946fe' for executor 'executor_4' of framework '20151015-120419-16842879-5050-1244-0000' failed to start: Failed to fetch URIs for container 'aca9e80a-5a34-4c29-a123-f025dc4946fe'with exit status: 256
```

I cannot find any log files for the TaskManager.

> Integrate Flink with Apache Mesos
> ---------------------------------
>
> Key: FLINK-1984
> URL: https://issues.apache.org/jira/browse/FLINK-1984
> Project: Flink
> Issue Type: New Feature
> Components: New Components
> Reporter: Robert Metzger
> Priority: Minor
> Attachments: 251.patch
>
>
> There are some users asking for an integration of Flink into Mesos.
> There also is a pending pull request for adding Mesos support for Flink:
> https://github.com/apache/flink/pull/251
> But the PR is insufficiently tested. I'll add the code of the pull request to
> this JIRA in case somebody wants to pick it up in the future.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)