I have also asked the same question on Stack Overflow <https://stackoverflow.com/questions/66355206/flink-1-12-1-example-application-failing-on-a-single-node-yarn-cluster>. Any suggestions here?

On Wed, Feb 24, 2021 at 10:25 PM Debraj Manna <subharaj.ma...@gmail.com> wrote:

> I am trying out flink example as explained in flink docs
> <https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/yarn.html#application-mode>
> in a single node yarn cluster
> <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html#Standalone_Operation>.
>
> On executing
>
> ubuntu@vrni-platform:~/build-target/flink$ ./bin/flink run-application -t yarn-application ./examples/streaming/TopSpeedWindowing.jar
>
> It is failing with the below errors
>
> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn Application Cluster
>     at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:465)
>     at org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)
>     at org.apache.flink.client.cli.CliFrontend.runApplication(CliFrontend.java:213)
>     at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1061)
>     at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1136)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>     at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1136)
> Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
> Diagnostics from YARN: Application application_1614159836384_0045 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1614159836384_0045_000001 exited with exitCode: -1000
> Failing this attempt.Diagnostics: [2021-02-24 16:19:39.409]File file:/home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar does not exist
> java.io.FileNotFoundException: File file:/home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar does not exist
>     at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:867)
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>     at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
>     at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
>     at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>     at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:242)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:235)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:223)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>
> I have made the log level DEBUG and I do see that flink-dist_2.12-1.12.1.jar
> is getting copied to /home/ubuntu/.flink/application_1614159836384_0045.
>
> 2021-02-24 16:19:37,768 DEBUG org.apache.flink.yarn.YarnApplicationFileUploader [] - Got modification time 1614183577000 from remote path file:/home/ubuntu/.flink/application_1614159836384_0045/TopSpeedWindowing.jar
> 2021-02-24 16:19:37,769 DEBUG org.apache.flink.yarn.YarnApplicationFileUploader [] - Copying from file:/home/ubuntu/build-target/flink/lib/flink-dist_2.12-1.12.1.jar to file:/home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar with replication factor 1
>
> The entire DEBUG logs are placed here
> <https://gist.github.com/debraj-manna/a38addc37a322cb242fc66fab1f9fee7>.
> Nodemanager logs are placed here
> <https://gist.github.com/debraj-manna/3732616fe78db439c0d6453454e8e02b>.
>
> Can someone let me know what is going wrong? Does flink not support single
> node yarn cluster for development?