In my setup hadoop-yarn-nodemanager is running as the yarn user:

ubuntu@vrni-platform:/tmp/flink$ ps -ef | grep nodemanager
yarn  4953  1  2 05:53 ?  00:11:26 /usr/lib/jvm/java-8-openjdk/bin/java -Dproc_nodemanager -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/lib/heap-dumps/yarn -XX:+ExitOnOutOfMemoryError -Dyarn.log.dir=/var/log/hadoop-yarn -Dyarn.log.file=hadoop-yarn-nodemanager-vrni-platform.log -Dyarn.home.dir=/usr/lib/hadoop-yarn -Dyarn.root.logger=INFO,console -Djava.library.path=/usr/lib/hadoop/lib/native -Xmx512m -Dhadoop.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=hadoop-yarn-nodemanager-vrni-platform.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.yarn.server.nodemanager.NodeManager
I was executing the ./bin/flink command as the ubuntu user, and in my setup the yarn user does not have permission to write to ubuntu's home folder:

ubuntu@vrni-platform:/tmp/flink$ echo ~ubuntu
/home/ubuntu
ubuntu@vrni-platform:/tmp/flink$ echo ~yarn
/var/lib/hadoop-yarn

It appears to me that Flink needs permission to write to the submitting user's home directory to create the .flink folder, even when the job is submitted to YARN. It works fine in my setup if I run Flink as the yarn user.

Just for my knowledge, is there any config in Flink to specify the location of the .flink folder?
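Something along the lines of the sketch below is what I have in mind. I have not tested it, and I am assuming the yarn.staging-directory option applies to these .flink staging files and can be passed as a dynamic property; the HDFS path is only an example:

# Untested sketch, assuming yarn.staging-directory controls where the
# .flink/<application-id> staging files are written. The same key could
# presumably also go into conf/flink-conf.yaml instead of being passed on
# the command line.
./bin/flink run-application -t yarn-application \
    -Dyarn.staging-directory=hdfs:///user/flink/staging \
    ./examples/streaming/TopSpeedWindowing.jar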
On Thu, Feb 25, 2021 at 10:48 AM Debraj Manna <subharaj.ma...@gmail.com> wrote:

> The same has been asked on StackOverflow
> <https://stackoverflow.com/questions/66355206/flink-1-12-1-example-application-failing-on-a-single-node-yarn-cluster>
> as well. Any suggestions here?
>
> On Wed, Feb 24, 2021 at 10:25 PM Debraj Manna <subharaj.ma...@gmail.com> wrote:
>
>> I am trying out the Flink example as explained in the Flink docs
>> <https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/yarn.html#application-mode>
>> on a single node YARN cluster
>> <https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html#Standalone_Operation>.
>>
>> On executing
>>
>> ubuntu@vrni-platform:~/build-target/flink$ ./bin/flink run-application -t yarn-application ./examples/streaming/TopSpeedWindowing.jar
>>
>> it fails with the below errors:
>>
>> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn Application Cluster
>>   at org.apache.flink.yarn.YarnClusterDescriptor.deployApplicationCluster(YarnClusterDescriptor.java:465)
>>   at org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer.run(ApplicationClusterDeployer.java:67)
>>   at org.apache.flink.client.cli.CliFrontend.runApplication(CliFrontend.java:213)
>>   at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1061)
>>   at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1136)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>>   at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1136)
>> Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
>> Diagnostics from YARN: Application application_1614159836384_0045 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1614159836384_0045_000001 exited with exitCode: -1000
>> Failing this attempt.Diagnostics: [2021-02-24 16:19:39.409]File file:/home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar does not exist
>> java.io.FileNotFoundException: File file:/home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar does not exist
>>   at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
>>   at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:867)
>>   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>>   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
>>   at org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:269)
>>   at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
>>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
>>   at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
>>   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:242)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:235)
>>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:223)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>   at java.lang.Thread.run(Thread.java:748)
>>
>> I have set the log level to DEBUG and I do see that flink-dist_2.12-1.12.1.jar is getting copied to /home/ubuntu/.flink/application_1614159836384_0045:
>>
>> 2021-02-24 16:19:37,768 DEBUG org.apache.flink.yarn.YarnApplicationFileUploader [] - Got modification time 1614183577000 from remote path file:/home/ubuntu/.flink/application_1614159836384_0045/TopSpeedWindowing.jar
>> 2021-02-24 16:19:37,769 DEBUG org.apache.flink.yarn.YarnApplicationFileUploader [] - Copying from file:/home/ubuntu/build-target/flink/lib/flink-dist_2.12-1.12.1.jar to file:/home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar with replication factor 1
>>
>> The entire DEBUG logs are placed here
>> <https://gist.github.com/debraj-manna/a38addc37a322cb242fc66fab1f9fee7>.
>> The NodeManager logs are placed here
>> <https://gist.github.com/debraj-manna/3732616fe78db439c0d6453454e8e02b>.
>>
>> Can someone let me know what is going wrong? Does Flink not support a single node YARN cluster for development?
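For anyone hitting the same error: the check below is roughly how the "does not exist" message above traces back to permissions in a setup like mine. This is an untested sketch and the paths are from my environment:

# The jar exists when listed as the submitting user ...
ls -l /home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar

# ... but the yarn user (the user the NodeManager runs as in my setup) cannot
# traverse /home/ubuntu, so the YARN localizer (FSDownload) reports the file
# as missing with a FileNotFoundException.
sudo -u yarn ls -l /home/ubuntu/.flink/application_1614159836384_0045/flink-dist_2.12-1.12.1.jar
# expected: "Permission denied" when ubuntu's home directory is not readable by yarn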