Adding metric-query port makes it a bit better, but there is still an error


019-02-22 00:03:56,173 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor 
           - Could not resolve ResourceManager address 
akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/user/resourcemanager, 
retrying in 10000 ms: Ask timed out on 
[ActorSelection[Anchor(akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/),
 Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of 
type "akka.actor.Identify"..
2019-02-22 00:04:16,213 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not 
resolve ResourceManager address 
akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/user/resourcemanager, 
retrying in 10000 ms: Ask timed out on 
[ActorSelection[Anchor(akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/),
 Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of 
type "akka.actor.Identify"..
2019-02-22 00:04:36,253 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not 
resolve ResourceManager address 
akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/user/resourcemanager, 
retrying in 10000 ms: Ask timed out on 
[ActorSelection[Anchor(akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/),
 Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of 
type "akka.actor.Identify"..
2019-02-22 00:04:56,293 INFO  
org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not 
resolve ResourceManager address 
akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/user/resourcemanager, 
retrying in 10000 ms: Ask timed out on 
[ActorSelection[Anchor(akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/),
 Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of 
type "akka.actor.Identify”..

In the task manager and

2019-02-21 23:59:46,479 ERROR akka.remote.EndpointWriter                        
            - dropping message [class akka.actor.ActorSelectionMessage] for 
non-local recipient 
[Actor[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/]] arriving at 
[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123] inbound addresses are 
[akka.tcp://flink@127.0.0.1:6123]
2019-02-21 23:59:57,808 ERROR akka.remote.EndpointWriter                        
            - dropping message [class akka.actor.ActorSelectionMessage] for 
non-local recipient 
[Actor[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/]] arriving at 
[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123] inbound addresses are 
[akka.tcp://flink@127.0.0.1:6123]
2019-02-22 00:00:06,519 ERROR akka.remote.EndpointWriter                        
            - dropping message [class akka.actor.ActorSelectionMessage] for 
non-local recipient 
[Actor[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/]] arriving at 
[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123] inbound addresses are 
[akka.tcp://flink@127.0.0.1:6123]
2019-02-22 00:00:17,849 ERROR akka.remote.EndpointWriter                        
            - dropping message [class akka.actor.ActorSelectionMessage] for 
non-local recipient 
[Actor[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/]] arriving at 
[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123] inbound addresses are 
[akka.tcp://flink@127.0.0.1:6123]
2019-02-22 00:00:26,558 ERROR akka.remote.EndpointWriter                        
            - dropping message [class akka.actor.ActorSelectionMessage] for 
non-local recipient 
[Actor[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/]] arriving at 
[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123] inbound addresses are 
[akka.tcp://flink@127.0.0.1:6123]
2019-02-22 00:00:37,888 ERROR akka.remote.EndpointWriter                        
            - dropping message [class akka.actor.ActorSelectionMessage] for 
non-local recipient 
[Actor[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/]] arriving at 
[akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123] inbound addresses are 
[akka.tcp://flink@127.0.0.1:6123]

I the job manager

Port 6123 is opened in both Job Manager deployment

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ template "fullname" . }}-jobmanager
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9249'
      labels:
        server: flink
        app: {{ template "fullname" . }}
        component: jobmanager
    spec:
      containers:
      - name: jobmanager
        image: {{ .Values.image }}:{{ .Values.imageTag }}
        imagePullPolicy: {{ .Values.imagePullPolicy }}
        args:
        - jobmanager
        ports:
        - containerPort: 6123
          name: rpc
        - containerPort: 6124
          name: blob
        - containerPort: 8081
          name: ui
        env:
        - name: CONTAINER_METRIC_PORT
          value: '{{ .Values.flink.metric_query_port }}'
        - name: JOB_MANAGER_RPC_ADDRESS
          value : {{ template "fullname" . }}-jobmanager
        livenessProbe:
          httpGet:
            path: /overview
            port: 8081
          initialDelaySeconds: 30
          periodSeconds: 10
        resources:
          limits:
            cpu: {{ .Values.resources.jobmanager.limits.cpu }}
            memory: {{ .Values.resources.jobmanager.limits.memory }}
          requests:
            cpu: {{ .Values.resources.jobmanager.requests.cpu }}
            memory: {{ .Values.resources.jobmanager.requests.memory }}

And Job manager service

apiVersion: v1
kind: Service
metadata:
  name: {{ template "fullname" . }}-jobmanager
spec:
  ports:
  - name: rpc
    port: 6123
  - name: blob
    port: 6124
  - name: ui
    port: 8081
  selector:
    app: {{ template "fullname" . }}
    component: jobmanager




Boris Lublinsky
FDP Architect
boris.lublin...@lightbend.com
https://www.lightbend.com/

> On Feb 21, 2019, at 6:13 PM, Boris Lublinsky <boris.lublin...@lightbend.com> 
> wrote:
> 
> 
> Boris Lublinsky
> FDP Architect
> boris.lublin...@lightbend.com <mailto:boris.lublin...@lightbend.com>
> https://www.lightbend.com/
> 
>> On Feb 21, 2019, at 2:05 AM, Konstantin Knauf <konstan...@ververica.com 
>> <mailto:konstan...@ververica.com>> wrote:
>> 
>> Hi Boris, 
>> 
>> the exact command depends on the docker-entrypoint.sh script and the image 
>> you are using. For the example contained in the Flink repository it is 
>> "task-manager", I think. The important thing is to pass "taskmanager.host" 
>> to the Taskmanager process. You can verify by checking the Taskmanager logs. 
>> These should contain lines like below:
>> 
>> 2019-02-21 08:03:00,004 INFO  
>> org.apache.flink.runtime.taskexecutor.TaskManagerRunner      [] -  Program 
>> Arguments:
>> 2019-02-21 08:03:00,008 INFO  
>> org.apache.flink.runtime.taskexecutor.TaskManagerRunner      [] -     
>> -Dtaskmanager.host=10.12.10.173
>> 
>> In the Jobmanager logs you should see that the Taskmanager is registered 
>> under the IP above in a line similar to:
>> 
>> 2019-02-21 08:03:26,874 INFO  
>> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - 
>> Registering TaskManager with ResourceID a0513ba2c472d2d1efc07626da9c1bda 
>> (akka.tcp://flink@10.12.10.173:46531/user/taskmanager_0 
>> <http://flink@10.12.10.173:46531/user/taskmanager_0>) at ResourceManager
>> 
>> A service per Taskmanager is not required. The purpose of the config 
>> parameter is that the Jobmanager addresses the taskmanagers by IP instead of 
>> hostname.
>> 
>> Hope this helps!
>> 
>> Cheers, 
>> 
>> Konstantin
>> 
>> 
>> 
>> On Wed, Feb 20, 2019 at 4:37 PM Boris Lublinsky 
>> <boris.lublin...@lightbend.com <mailto:boris.lublin...@lightbend.com>> wrote:
>> Also, The suggested workaround does not quite work.
>> 2019-02-20 15:27:43,928 WARN  akka.remote.ReliableDeliverySupervisor         
>>                - Association with remote system 
>> [akka.tcp://flink-metrics@flink-taskmanager-1:6170 <>] has failed, address 
>> is now gated for [50] ms. Reason: [Association failed with 
>> [akka.tcp://flink-metrics@flink-taskmanager-1:6170 <>]] Caused by: 
>> [flink-taskmanager-1: No address associated with hostname]
>> 2019-02-20 15:27:48,750 ERROR 
>> org.apache.flink.runtime.rest.handler.legacy.files.StaticFileServerHandler  
>> - Caught exception
>> 
>> I think the problem is that its trying to connect to flink-task-manager-1
>> 
>> Using busybody to experiment with nslookup, I can see
>> / # nslookup flink-taskmanager-1.flink-taskmanager
>> Server:    10.0.11.151
>> Address 1: 10.0.11.151 ip-10-0-11-151.us 
>> <http://ip-10-0-11-151.us/>-west-2.compute.internal
>> 
>> Name:      flink-taskmanager-1.flink-taskmanager
>> Address 1: 10.131.2.136 
>> flink-taskmanager-1.flink-taskmanager.flink.svc.cluster.local
>> / # nslookup flink-taskmanager-1
>> Server:    10.0.11.151
>> Address 1: 10.0.11.151 ip-10-0-11-151.us 
>> <http://ip-10-0-11-151.us/>-west-2.compute.internal
>> 
>> nslookup: can't resolve 'flink-taskmanager-1'
>> / # nslookup flink-taskmanager-0.flink-taskmanager
>> Server:    10.0.11.151
>> Address 1: 10.0.11.151 ip-10-0-11-151.us 
>> <http://ip-10-0-11-151.us/>-west-2.compute.internal
>> 
>> Name:      flink-taskmanager-0.flink-taskmanager
>> Address 1: 10.131.0.111 
>> flink-taskmanager-0.flink-taskmanager.flink.svc.cluster.local
>> / # nslookup flink-taskmanager-0
>> Server:    10.0.11.151
>> Address 1: 10.0.11.151 ip-10-0-11-151.us 
>> <http://ip-10-0-11-151.us/>-west-2.compute.internal
>> 
>> nslookup: can't resolve 'flink-taskmanager-0'
>> / # 
>> 
>> So the name should be postfixed with the service name. How do I force it? I 
>> suspect I am missing config parameter
>> 
>>  
>> Boris Lublinsky
>> FDP Architect
>> boris.lublin...@lightbend.com <mailto:boris.lublin...@lightbend.com>
>> https://www.lightbend.com/ <https://www.lightbend.com/>
>>> On Feb 19, 2019, at 4:33 AM, Konstantin Knauf <konstan...@ververica.com 
>>> <mailto:konstan...@ververica.com>> wrote:
>>> 
>>> Hi Boris, 
>>> 
>>> the solution is actually simpler than it sounds from the ticket. The only 
>>> thing you need to do is to set the "taskmanager.host" to the Pod's IP 
>>> address in the Flink configuration. The easiest way to do this is to pass 
>>> this config dynamically via a command-line parameter. 
>>> 
>>> The Deployment spec could looks something like this:
>>> containers:
>>> - name: taskmanager
>>>   [...]
>>>   args:
>>>   - "taskmanager.sh"
>>>   - "start-foreground"
>>>   - "-Dtaskmanager.host=$(K8S_POD_IP)"
>>>   [...]
>>>   env:
>>>   - name: K8S_POD_IP
>>>     valueFrom:
>>>       fieldRef:
>>>         fieldPath: status.podIP
>>> 
>>> Hope this helps and let me know if this works. 
>>> 
>>> Best, 
>>> 
>>> Konstantin
>>> 
>>> On Sun, Feb 17, 2019 at 9:51 PM Boris Lublinsky 
>>> <boris.lublin...@lightbend.com <mailto:boris.lublin...@lightbend.com>> 
>>> wrote:
>>> I was looking at this issue 
>>> https://issues.apache.org/jira/browse/FLINK-11127 
>>> <https://issues.apache.org/jira/browse/FLINK-11127>
>>> Apparently there is a workaround for it.
>>> Is it possible provide the complete helm chart for it.
>>> Bits and pieces are in the ticket, but it would be nice to see the full 
>>> chart
>>> 
>>> Boris Lublinsky
>>> FDP Architect
>>> boris.lublin...@lightbend.com <mailto:boris.lublin...@lightbend.com>
>>> https://www.lightbend.com/ <https://www.lightbend.com/>
>>> 
>>> 
>>> -- 
>>> Konstantin Knauf | Solutions Architect
>>> +49 160 91394525
>>> 
>>>  <https://www.ververica.com/>
>>> Follow us @VervericaData
>>> --
>>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink 
>>> Conference
>>> Stream Processing | Event Driven | Real Time
>>> --
>>> Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>> --
>>> Data Artisans GmbH
>>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>>> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen    
>> 
>> 
>> 
>> -- 
>> Konstantin Knauf | Solutions Architect
>> +49 160 91394525
>>  <https://www.ververica.com/>
>> Follow us @VervericaData
>> --
>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference
>> Stream Processing | Event Driven | Real Time
>> --
>> Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>> --
>> Data Artisans GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen    
> 

Reply via email to