What's in the container log for the container that failed?
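
If log aggregation is enabled on the cluster, something like the following should pull the logs for that application. This is only a sketch: the application ID is copied from the YARN diagnostics quoted below, and the fallback branch is there just so the snippet does not fail on a machine without the yarn client.

```shell
# Application ID taken from the "Diagnostics from YARN" line in the quoted message.
APP_ID=application_1504851547322_0003
CMD="yarn logs -applicationId $APP_ID"

if command -v yarn >/dev/null 2>&1; then
  # Prints the aggregated logs of all containers of the application,
  # including the failed ApplicationMaster container.
  $CMD
else
  # Assumption: yarn is only installed on the cluster nodes.
  echo "yarn not found on PATH; run this on a cluster node: $CMD"
fi
```

Without log aggregation, the same files should still be in the NodeManager's local log directory on the host that ran the container.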

On Sep 11, 2017 2:17 AM, "Sridhar Chellappa" <flinken...@gmail.com> wrote:

I am trying to start Flink (version 1.3.0) on YARN (Hadoop 2.8.1) by issuing
the following command:

~/flink-1.3.0/bin/yarn-session.sh -s 4 -n 10 -jm 4096 -tm 4096 -d

I am seeing a flurry of these errors:

2017-09-11 08:17:11,410 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
2017-09-11 08:17:11,661 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
2017-09-11 08:17:11,912 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
2017-09-11 08:17:12,163 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster


My deployment then fails with the following exception:

Error while deploying YARN cluster: Couldn't deploy Yarn cluster
java.lang.RuntimeException: Couldn't deploy Yarn cluster
    at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:439)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:630)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:486)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:483)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:483)
Caused by: org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1504851547322_0003 failed 2 times due to AM Container for appattempt_1504851547322_0003_000002 exited with  exitCode: 31
Failing this attempt.Diagnostics: Exception from container-launch.
Container id: container_1504851547322_0003_02_000001
Exit code: 31
Stack trace: ExitCodeException exitCode=31:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:972)
    at org.apache.hadoop.util.Shell.run(Shell.java:869)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1170)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:236)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:305)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:84)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)



Further debugging in the JobManager logs shows:

Resetting connection and trying again with a new connection.
2017-09-11 08:17:11,820 INFO  org.apache.zookeeper.ZooKeeper
                     - Initiating client connection,
connectString=high-availability.zookeeper.quorum:
10.200.0.6:2181,10.200.0.7:2181,10.200.0.9:2181 sessionTimeout=60000
watcher=org.apache.flink.shaded.org.apache.curator.ConnectionState@57bd802b
2017-09-11 08:17:11,927 ERROR org.apache.flink.yarn.YarnApplicationMasterRunner             - YARN Application Master initialization failed
java.net.UnknownHostException: high-availability.zookeeper.quorum:
10.200.0.6: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
        at java.net.InetAddress.getAllByName(InetAddress.java:1192)
        at java.net.InetAddress.getAllByName(InetAddress.java:1126)
        at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)


Any help in figuring this out would be appreciated.
