Hi Martin,

Could you `docker exec` into the problematic taskmanager and check whether
its hostname resolves to the correct IP? You could use
`nslookup {tm_hostname}` to verify.
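
If `nslookup` is not installed in the image, a small script can run a
similar check. This is only a sketch: it assumes a Python 3 interpreter is
available inside the container (it may not be in your image), and the name
`check_resolution` is made up for illustration. It does a forward lookup
(name to IP) and then a reverse lookup (IP back to name). My guess, not
something I have confirmed for your setup, is that a failing reverse lookup
would match the "no hostname could be resolved" message you are seeing.

```python
import socket


def check_resolution(hostname, forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Resolve hostname to an IPv4 address, then map that address back to a name.

    The forward/reverse parameters default to the standard library resolvers
    and exist only so the function can be exercised without real DNS.
    """
    result = {"hostname": hostname, "ip": None,
              "reverse_name": None, "error": None}
    try:
        # Forward lookup: name -> IPv4 address string.
        result["ip"] = forward(hostname)
        # Reverse lookup: returns (primary_name, aliases, addresses).
        result["reverse_name"] = reverse(result["ip"])[0]
    except OSError as exc:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        result["error"] = str(exc)
    return result


if __name__ == "__main__":
    # Check the name this container reports for itself.
    print(check_resolution(socket.gethostname()))
```

Running it on a 'good' and a 'bad' taskmanager and comparing the two results
should show whether the forward step, the reverse step, or neither fails.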


Best,
Yang

Martin, Nick J [US] (IS) <nick.mar...@ngc.com> wrote on Sat, Dec 21, 2019 at 6:07 AM:

> I’m running Flink 1.7.2 in a Docker swarm. Intermittently, new task
> managers will fail to resolve their own host names when starting up. In the
> log I see “no hostname could be resolved” messages coming from
> TaskManagerLocation. The webUI on the jobmanager shows the taskmanagers as
> associated/connected with the jobmanager, but their akka paths show
> their IP, rather than the container name that ‘good’ taskmanagers show.
> Those taskmanagers that are listed by IP give ‘failed to connect’ errors
> when new jobs are started that try to use those taskmanagers, and that job
> eventually fails. But the taskmanagers with this condition still give
> regular heartbeats to the Jobmanager, so the jobmanager keeps trying to
> assign work to them. Does anyone know what’s going on here?
>
