Hi all,

I am upgrading my DataSet jobs from Flink 1.8 to 1.12. After the upgrade I started to receive errors like this one:
14:12:57,441 INFO org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager - Worker container_e120_1608377880203_0751_01_000112 is terminated. Diagnostics: Resource hdfs://bigdata/user/hadoop/.flink/application_1608377880203_0751/jobs.jar changed on src filesystem (expected 1610892446439, was 1610892446971
java.io.IOException: Resource hdfs://bigdata/user/hadoop/.flink/application_1608377880203_0751/jobs.jar changed on src filesystem (expected 1610892446439, was 1610892446971
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:257)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:228)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:221)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.call(ContainerLocalizer.java:209)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I understand it is somehow related to FLINK-12195, but this time the error comes from the Hadoop code. I am running a very old version of the HDP platform, 2.6.5, so it might be the one to blame. But the same code was working perfectly fine before the upgrade, so I am confused. Could you please advise?

Thank you!
Mark