Re: Temporary failure in name resolution on JobManager

2019-12-02 Thread David Maddison
Thanks Yang. We did try both of those properties, but they didn't fix it. However, we did EVENTUALLY (after some late nights!) track the issue down, not to DNS resolution but to an obscure bug in our connector code :-( Thanks for your response, /David/

Re: Temporary failure in name resolution on JobManager

2019-12-01 Thread Yang Wang
Hi David, Do you mean that when the JobManager starts, DNS has a problem and the service cannot be resolved, and then, even after DNS recovers, the JobManager JVM still cannot look up the name? I think it may be caused by the JVM DNS cache. You could set the TTL and give it a try: sun.net.inetaddr.ttl and sun.net.inetaddr.negative.ttl.
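
A minimal sketch of how those TTLs can be passed to Flink's JVMs (the 60/10 second values and the java.security path are illustrative, not recommendations):

    # flink-conf.yaml: pass DNS cache TTLs (in seconds) to the JM/TM JVMs
    env.java.opts: "-Dsun.net.inetaddr.ttl=60 -Dsun.net.inetaddr.negative.ttl=10"

    # alternatively, set the equivalent security properties JVM-wide in
    # $JAVA_HOME/jre/lib/security/java.security:
    #   networkaddress.cache.ttl=60
    #   networkaddress.cache.negative.ttl=10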

Re: Temporary failure in name resolution

2018-04-04 Thread Fabian Hueske
Hi, The issue might be related to garbage collection pauses, during which the TM JVM cannot communicate with the JM. The metrics include stats for memory consumption [1] and GC activity [2] that can help to diagnose the problem. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-re…
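
For reference, a minimal metrics setup that exposes the JVM stats Fabian refers to (a sketch; the reporter name "jmx" is arbitrary), with the relevant metrics under the Status.JVM scope:

    # flink-conf.yaml: expose Flink metrics via JMX
    metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
    # per-TaskManager metrics to watch include:
    #   Status.JVM.Memory.Heap.Used
    #   Status.JVM.GarbageCollector.<CollectorName>.Count
    #   Status.JVM.GarbageCollector.<CollectorName>.Time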

Re: Temporary failure in name resolution

2018-04-03 Thread miki haiat
Hi, I checked the code again to figure out where the problem could be. I just wondered if I'm implementing the Evictor correctly? Full code: https://gist.github.com/miko-code/6d7010505c3cb95be122364b29057237 public static class EsbTraceEvictor implements Evictor { org.slf4j.Logger LOG = …
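
For comparison, a minimal count-based Evictor against Flink's Evictor interface (a sketch only: the String element type, TimeWindow, and the maxElements rule are assumptions, not the logic from the gist):

    import java.util.Iterator;

    import org.apache.flink.streaming.api.windowing.evictors.Evictor;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.streaming.runtime.operators.windowing.TimestampedValue;

    // Keeps at most maxElements per window so window state cannot grow unbounded.
    public class EsbTraceEvictor implements Evictor<String, TimeWindow> {

        private final long maxElements;

        public EsbTraceEvictor(long maxElements) {
            this.maxElements = maxElements;
        }

        @Override
        public void evictBefore(Iterable<TimestampedValue<String>> elements, int size,
                TimeWindow window, EvictorContext ctx) {
            // Drop the oldest elements until at most maxElements remain.
            for (Iterator<TimestampedValue<String>> it = elements.iterator();
                    size > maxElements && it.hasNext(); size--) {
                it.next();
                it.remove();
            }
        }

        @Override
        public void evictAfter(Iterable<TimestampedValue<String>> elements, int size,
                TimeWindow window, EvictorContext ctx) {
            // No eviction after the window function has run.
        }
    }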

Re: Temporary failure in name resolution

2018-04-03 Thread Hao Sun
Hi Timo, we have a similar issue: a TM got killed by a job. Is there a way to monitor JVM status? If through the monitoring metrics, which metrics should I look at? We are running Flink on K8S. Is there a possibility that a job consumes too much network bandwidth, so the JM and TM cannot connect?
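
One quick K8S-side check (a sketch; <ns> and <tm-pod> are placeholders) is whether the TaskManager container was OOM-killed by Kubernetes rather than failing inside Flink:

    kubectl -n <ns> describe pod <tm-pod>
    # Under "Last State: Terminated", a "Reason: OOMKilled" means the
    # container hit its memory limit, pointing to heap/off-heap growth.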

Re: Temporary failure in name resolution

2018-04-03 Thread Timo Walther
Hi Miki, to me this sounds like your job has a resource leak, such that your memory fills up and the JVM of the TaskManager is killed at some point. What does your job look like? I see a WindowedStream.apply, which might not be appropriate if you have big/frequent windows where the evaluation …
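
The usual fix is incremental aggregation, which keeps per-window state constant instead of buffering every element for WindowedStream.apply. A sketch (the String input, the count logic, and the window choice are all illustrative):

    import org.apache.flink.api.common.functions.AggregateFunction;

    // Counts elements incrementally; only the Long accumulator is kept as state.
    public class CountAggregate implements AggregateFunction<String, Long, Long> {
        @Override public Long createAccumulator() { return 0L; }
        @Override public Long add(String value, Long acc) { return acc + 1; }
        @Override public Long getResult(Long acc) { return acc; }
        @Override public Long merge(Long a, Long b) { return a + b; }
    }

    // usage (stream, key selector, and window are placeholders):
    // stream.keyBy(...)
    //       .window(TumblingEventTimeWindows.of(Time.minutes(1)))
    //       .aggregate(new CountAggregate());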