Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-30 Thread Matthias Pohl
Thanks for sharing. I was wondering why you don't use $PORT0 in your command. And: Are the ports properly configured in the Marathon network configuration [1]? But the error seems to be unrelated to that setting. Other than that, I cannot see any other issue with the configuration. It could be that

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Matthias Pohl
...and if possible, it would be helpful to provide debug logs as well.

On Wed, Sep 29, 2021 at 6:33 PM Matthias Pohl wrote:
> Could you provide the entire JobManager logs so that we can see what's going on?
>
> On Wed, Sep 29, 2021 at 12:42 PM Javier Vegas wrote:
>> Thanks again, Matthias!
>>

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Matthias Pohl
Could you provide the entire JobManager logs so that we can see what's going on?

On Wed, Sep 29, 2021 at 12:42 PM Javier Vegas wrote:
> Thanks again, Matthias!
>
> Putting -Djobmanager.rpc.address=$HOST and -Djobmanager.rpc.port=$PORT0
> as params for appmaster.sh
> I see in the log they seem to tra

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Javier Vegas
Thanks again, Matthias! Putting -Djobmanager.rpc.address=$HOST and -Djobmanager.rpc.port=$PORT0 as params for appmaster.sh, I see in the log that they seem to be transformed into the correct values, -Djobmanager.rpc.address=10.0.23.35 and -Djobmanager.rpc.port=31009, but a bit later the appmaster dies with this new

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Matthias Pohl
The port has its own separate configuration parameter, jobmanager.rpc.port [1].

[1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/#jobmanager-rpc-port-1

On Wed, Sep 29, 2021 at 10:11 AM Javier Vegas wrote:
> Matthias, thanks for the suggestion! I changed my jobmanager.
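To illustrate the point about keeping the address and the port in separate options, here is a minimal sketch. The helper name build_rpc_opts is hypothetical and not part of Flink or Marathon; it only prints the two -D options and does not launch anything. The example values are the ones reported from the log elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flink): prints the two -D options
# built from a host and a port, e.g. Marathon's $HOST and $PORT0.
build_rpc_opts() {
    host="$1"
    port="$2"
    # jobmanager.rpc.address carries only the host;
    # the port goes into jobmanager.rpc.port.
    printf '%s %s\n' \
        "-Djobmanager.rpc.address=${host}" \
        "-Djobmanager.rpc.port=${port}"
}

# Example call with the values seen in the thread's log output.
build_rpc_opts "10.0.23.35" "31009"
```

In a Marathon command, the output of such a helper would presumably be appended to the mesos-appmaster.sh invocation, with $HOST and $PORT0 passed as the two arguments.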

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Javier Vegas
Matthias, thanks for the suggestion! I changed my jobmanager.rpc.address param from $HOSTNAME to $HOST:$PORT0, which I can see in the log resolves properly to the host IP and the port mapped to 8081:

2021-09-29 07:58:05.452 [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Djobmanager.r

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-29 Thread Matthias Pohl
One thing that was puzzling me yesterday when reading your post: Have you tried $HOST instead of $HOSTNAME in the Marathon configuration? When I played around with Mesos, I remember using HOST to resolve the host's IP address instead of the host's name. It could be that the hostname itself cannot b

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Javier Vegas
Another update: Looking more carefully in my appmaster log, I see the following:

2021-09-29 01:15:39.680 [flink-akka.actor.default-dispatcher-3] INFO o.a.f.m.runtime.clusterframework.MesosResourceManagerDriver - Registering as new framework.
2021-09-29 01:15:39.680 [flink-akka.actor.default-disp

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Javier Vegas
Thanks, Matthias! There are lots of apps deployed to the Mesos cluster; the task manager itself is deployed to Mesos via Marathon. In the Mesos log I can see the Job manager agent starting, but no error messages related to it. As you say, TaskManagers don't even have the chance to get confused ab

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Javier Vegas
Thanks, Roman! Looking at the log, it seems that the TaskManager can resolve $HOSTNAME to its own hostname (07a6b681ee0f), as seen in these lines:

2021-09-27 22:02:41.067 [main] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Djobmanager.rpc.address=*07a6b681ee0f*
2021-09-27 22:02:43

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Matthias Pohl
Hi Javier, I don't see anything that's configured in the wrong way based on the jobmanager logs you've provided. Have you been able to deploy other applications to this Mesos cluster? Do the Mesos master logs reveal anything? The variable resolution on the TaskManager side is a valid concern shared

Re: Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-28 Thread Roman Khachatryan
Hi, No additional ports need to be open as far as I know. Perhaps $HOSTNAME is substituted with something that is not resolvable on the TMs? Please also make sure that the following gets executed before mesos-appmaster.sh:

export HADOOP_CLASSPATH=$(hadoop classpath)
export MESOS_NATIVE_JAVA_LIBRARY=/path/t
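One way to confirm that these variables are actually set before mesos-appmaster.sh runs is a small pre-flight check in the container entrypoint. This is only a sketch: require_env is a hypothetical helper, not something shipped with Flink or Mesos, and it checks only that the variables are non-empty, not that their values are valid.

```shell
#!/bin/sh
# Hypothetical pre-flight check: returns non-zero if any of the named
# environment variables is unset or empty, and reports each one on stderr.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: warn if the variables named in the message above are not set.
require_env HADOOP_CLASSPATH MESOS_NATIVE_JAVA_LIBRARY \
    || echo "set the variables above before running mesos-appmaster.sh"
```

Running the check at the top of the entrypoint makes a missing export show up in the container log rather than as a later, less obvious failure.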

Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-27 Thread Javier Vegas
I am trying to start Flink 1.13.2 on Mesos following the instructions in https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/mesos/ and using Marathon to deploy a Docker image with both the Flink and my binaries. My entrypoint for the Docker image is: /op