Re: Jira issue Flink-11127

2019-02-22 Thread Boris Lublinsky
And it works now. The problem was that I was setting jobmanager.rest.address jobmanager.rpc.address, which was creating the actor system on the local host. I am still getting the messages below in the job manager periodically, but they seem to be harmless: ERROR org.apache.flink.runtime.rest

Re: Jira issue Flink-11127

2019-02-22 Thread Andrey Zagrebin
cc alek...@ververica.com On Fri, Feb 22, 2019 at 1:28 AM Boris Lublinsky <boris.lublin...@lightbend.com> wrote: > Adding metric-query port makes it a bit better, but there is still an error > > 2019-02-22 00:03:56,173 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor - Could

Re: Jira issue Flink-11127

2019-02-21 Thread Boris Lublinsky
Adding the metric-query port makes it a bit better, but there is still an error: 2019-02-22 00:03:56,173 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@maudlin-ibis-fdp-flink-jobmanager:6123/user/resourcemanager, retry

Re: Jira issue Flink-11127

2019-02-21 Thread Boris Lublinsky
Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ > On Feb 21, 2019, at 2:05 AM, Konstantin Knauf > wrote: > > Hi Boris, > > the exact command depends on the docker-entrypoint.sh script and the image > you are using. For the example contained in the Flin

Re: Jira issue Flink-11127

2019-02-21 Thread Boris Lublinsky
Konstantin, it still does not quite work. The IP is still in place, but… Here is the Job Manager log: metrics.reporters: prom metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter metrics.reporter.prom.port: 9249 Starting Job Manager config file: jobmanager.rest.address:

Re: Jira issue Flink-11127

2019-02-21 Thread Konstantin Knauf
Hi Boris, the exact command depends on the docker-entrypoint.sh script and the image you are using. For the example contained in the Flink repository it is "task-manager", I think. The important thing is to pass "taskmanager.host" to the Taskmanager process. You can verify by checking the Taskmana

Re: Jira issue Flink-11127

2019-02-20 Thread Boris Lublinsky
Also, the suggested workaround does not quite work. 2019-02-20 15:27:43,928 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink-metrics@flink-taskmanager-1:6170] has failed, address is now gated for [50] ms. Reason: [Association

Re: Jira issue Flink-11127

2019-02-19 Thread Boris Lublinsky
Thanks, Konstantin. Unfortunately, it does not work. The snippet from the task manager YAML is: containers: - name: taskmanager image: {{ .Values.image }}:{{ .Values.imageTag }} imagePullPolicy: {{ .Values.imagePullPolicy }} args: - taskmanager -Dtaskmanager.host=$(K8S_POD_IP) ports: - name:

Re: Jira issue Flink-11127

2019-02-19 Thread Konstantin Knauf
Hi Boris, the solution is actually simpler than it sounds from the ticket. The only thing you need to do is to set the "taskmanager.host" to the Pod's IP address in the Flink configuration. The easiest way to do this is to pass this config dynamically via a command-line parameter. The Deployment
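
For readers following the thread, here is a minimal sketch of what passing "taskmanager.host" as a command-line parameter could look like in a TaskManager container spec. It is an illustration under stated assumptions, not the configuration from the ticket: the image value is a placeholder, the env variable name K8S_POD_IP is taken from a later message in this thread, and whether the first argument is "taskmanager" or "task-manager" (and whether extra arguments are forwarded at all) depends on the docker-entrypoint.sh of the image in use. The Pod IP is read through the Kubernetes Downward API field status.podIP.

```yaml
# Sketch only: container section of a TaskManager Deployment.
containers:
  - name: taskmanager
    image: "<your-flink-image>"        # placeholder, not a real image name
    env:
      # Expose the Pod's own IP to the container via the Downward API.
      - name: K8S_POD_IP
        valueFrom:
          fieldRef:
            fieldPath: status.podIP
    args:
      # First argument selects the process; the exact value depends on the
      # image's docker-entrypoint.sh (assumption: "taskmanager").
      - taskmanager
      # Kubernetes expands $(K8S_POD_IP) from the env entry above.
      - "-Dtaskmanager.host=$(K8S_POD_IP)"
```

Each entry under args is passed to the entrypoint as a separate argument, so the process name and the -D option are listed as two items here. Whether the setting took effect can be checked in the TaskManager log, as suggested elsewhere in this thread.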

Jira issue Flink-11127

2019-02-17 Thread Boris Lublinsky
I was looking at this issue: https://issues.apache.org/jira/browse/FLINK-11127 Apparently there is a workaround for it. Is it possible to provide the complete Helm chart for it? Bits and pieces are in the ticket, but it would be nice to see the full