Shannon Carey created FLINK-4418:
------------------------------------
Summary: ClusterClient/ConnectionUtils#findConnectingAddress fails immediately if InetAddress.getLocalHost throws exception
Key: FLINK-4418
URL: https://issues.apache.org/jira/browse/FLINK-4418
Project: Flink
Issue Type: Bug
Components: Client
Affects Versions: 1.1.0
Reporter: Shannon Carey
When attempting to connect to a cluster with a ClusterClient, if the machine's
hostname cannot be resolved to an IP address, an exception is thrown and the
connection attempt fails. This happens, for example, when the hostname is not
mapped to a local IP in /etc/hosts.
The exception is below. I suggest that findAddressUsingStrategy() should catch
java.net.UnknownHostException thrown by InetAddress.getLocalHost() and return
null, allowing alternative strategies to be attempted by
findConnectingAddress(). I will open a PR to this effect. Ideally this could be
included in both 1.2 and 1.1.2.
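As a rough illustration of the proposed behavior (not the actual Flink code or the eventual PR), the sketch below shows a per-strategy lookup that swallows UnknownHostException and returns null, so the caller can move on to the next strategy. The AddressStrategy interface, the class name, and the loopback fallback are hypothetical stand-ins invented for this example; only the method names findAddressUsingStrategy and findConnectingAddress are borrowed from the report, and their signatures here are assumptions.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; this is NOT Flink's ConnectionUtils.
class LocalHostFallbackSketch {

    /** Hypothetical stand-in for one address detection strategy. */
    interface AddressStrategy {
        InetAddress tryFind() throws UnknownHostException;
    }

    /**
     * Runs a single strategy. An UnknownHostException (for example from
     * InetAddress.getLocalHost()) is treated as "no address found" and
     * reported as null instead of being propagated to the caller.
     */
    static InetAddress findAddressUsingStrategy(AddressStrategy strategy) {
        try {
            return strategy.tryFind();
        } catch (UnknownHostException e) {
            // The hostname is not resolvable; let the caller try the next strategy.
            return null;
        }
    }

    /** Tries each strategy in order and returns the first non-null address, or null. */
    static InetAddress findConnectingAddress(List<AddressStrategy> strategies) {
        for (AddressStrategy strategy : strategies) {
            InetAddress address = findAddressUsingStrategy(strategy);
            if (address != null) {
                return address;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // First strategy may throw on a host whose name is not in /etc/hosts;
        // the second is an assumed fallback used only for this illustration.
        List<AddressStrategy> strategies = Arrays.asList(
                InetAddress::getLocalHost,
                InetAddress::getLoopbackAddress);
        System.out.println("Selected address: " + findConnectingAddress(strategies));
    }
}
```

With this shape, a failing local hostname lookup no longer aborts the whole search; it only removes one candidate strategy from consideration.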
{code}
21:11:35 org.apache.flink.client.program.ProgramInvocationException: Failed to retrieve the JobManager gateway.
21:11:35 at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:430)
21:11:35 at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:90)
21:11:35 at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:389)
21:11:35 at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:75)
21:11:35 at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:334)
21:11:35 at com.expedia.www.flink.job.scheduler.FlinkJobSubmitter.get(FlinkJobSubmitter.java:81)
21:11:35 at com.expedia.www.flink.job.scheduler.streaming.StreamingJobManager.run(StreamingJobManager.java:105)
21:11:35 at com.expedia.www.flink.job.scheduler.JobScheduler.runStreamingApp(JobScheduler.java:69)
21:11:35 at com.expedia.www.flink.job.scheduler.JobScheduler.main(JobScheduler.java:34)
21:11:35 Caused by: java.lang.RuntimeException: Failed to resolve JobManager address at /10.2.89.80:43126
21:11:35 at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:189)
21:11:35 at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:649)
21:11:35 at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:428)
21:11:35 ... 8 more
21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: ip-10-2-64-47: unknown error
21:11:35 at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
21:11:35 at org.apache.flink.runtime.net.ConnectionUtils.findAddressUsingStrategy(ConnectionUtils.java:232)
21:11:35 at org.apache.flink.runtime.net.ConnectionUtils.findConnectingAddress(ConnectionUtils.java:123)
21:11:35 at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:187)
21:11:35 ... 10 more
21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: unknown error
21:11:35 at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
21:11:35 at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
21:11:35 at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
21:11:35 at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
21:11:35 ... 13 more
{code}