Hi Robert,

Thanks for your answer. Do you mean the log file in (e.g.) flink-0.10.0/log/flink-hadoop-client-ip-172-31-10-193.log, or do you mean another log file?
In this log, the error message is as follows:

08:16:03,437 INFO org.apache.flink.runtime.client.JobClient - Job execution complete
08:16:03,438 INFO org.apache.flink.api.java.ExecutionEnvironment - The job has 5 registered types and 0 default Kryo serializers
08:16:03,444 INFO org.apache.flink.runtime.client.JobClientActor - Received job Flink Java Job at Thu Jan 21 08:16:03 UTC 2016 (73f8ab2fbab61fb72dc4a53fd8dcbb9f).
08:16:03,444 INFO org.apache.flink.runtime.client.JobClientActor - Could not submit job Flink Java Job at Thu Jan 21 08:16:03 UTC 2016 (73f8ab2fbab61fb72dc4a53fd8dcbb9f), because there is no connection to a JobManager.
08:16:03,446 INFO org.apache.flink.runtime.client.JobClientActor - Connected to new JobManager akka.tcp://flink@172.31.5.123:34614/user/jobmanager.
08:16:03,446 INFO org.apache.flink.runtime.client.JobClientActor - Sending message to JobManager akka.tcp://flink@172.31.5.123:34614/user/jobmanager to submit job Flink Java Job at Thu Jan 21 08:16:03 UTC 2016 (73f8ab2fbab61fb72dc4a53fd8dcbb9f) and wait for progress
08:16:03,446 INFO org.apache.flink.runtime.client.JobClientActor - Upload jar files to job manager akka.tcp://flink@172.31.5.123:34614/user/jobmanager.
08:16:03,860 INFO org.apache.flink.runtime.client.JobClientActor - Submit job to the job manager akka.tcp://flink@172.31.5.123:34614/user/jobmanager.
08:16:03,860 INFO org.apache.flink.runtime.client.JobClient - Job execution failed
08:16:03,860 ERROR org.apache.flink.client.CliFrontend - Error while running the command.
org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:512)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:395)
    at org.apache.flink.client.program.Client.runBlocking(Client.java:252)
    at org.apache.flink.client.CliFrontend.executeProgramBlocking(CliFrontend.java:675)
    at org.apache.flink.client.CliFrontend.run(CliFrontend.java:326)
    at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:977)
    at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1027)
Caused by: java.lang.reflect.UndeclaredThrowableException
    at eu.amidst.flinklink.core.io.DataFlinkLoader.loadHeaderARFFFolder(DataFlinkLoader.java:176)
    at eu.amidst.flinklink.core.io.DataFlinkLoader.loadHeader(DataFlinkLoader.java:137)
    at eu.amidst.flinklink.core.io.DataFlinkLoader.access$000(DataFlinkLoader.java:43)
    at eu.amidst.flinklink.core.io.DataFlinkLoader$DataFlinkFile.<init>(DataFlinkLoader.java:281)
    at eu.amidst.flinklink.core.io.DataFlinkLoader.loadDataFromFolder(DataFlinkLoader.java:80)
    at eu.amidst.flinklink.core.io.DataFlinkLoader.loadDynamicDataFromFolder(DataFlinkLoader.java:90)
    at eu.amidst.flinklink.examples.reviewMeeting2015.GenerateData.createDataSetsDBN(GenerateData.java:194)
    at eu.amidst.flinklink.examples.reviewMeeting2015.GenerateData.main(GenerateData.java:208)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:497)
    ... 6 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Could not upload the jar files to the job manager.
    at org.apache.flink.client.program.Client.runBlocking(Client.java:370)
    at org.apache.flink.client.program.Client.runBlocking(Client.java:348)
    at org.apache.flink.client.program.Client.runBlocking(Client.java:315)
    at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:70)
    at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:804)
    at org.apache.flink.api.java.DataSet.collect(DataSet.java:410)
    at eu.amidst.flinklink.core.io.DataFlinkLoader.loadHeaderARFFFolder(DataFlinkLoader.java:156)
    ... 18 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not upload the jar files to the job manager.
    at org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:338)
    at akka.dispatch.Futures$$anonfun$future$1.apply(Future.scala:94)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.io.IOException: PUT operation failed: Connection reset
    at org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:465)
    at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:327)
    at org.apache.flink.runtime.jobgraph.JobGraph.uploadRequiredJarFiles(JobGraph.java:525)
    at org.apache.flink.runtime.client.JobClient.uploadJarFiles(JobClient.java:292)
    at org.apache.flink.runtime.client.JobClientActor$2.call(JobClientActor.java:332)
    ... 10 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at org.apache.flink.runtime.blob.BlobUtils.writeLength(BlobUtils.java:262)
    at org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:451)
    ... 14 more

That is, it finds the jar files well until almost the end of the execution. Actually, if I run it again it may or may not work.

Ana

On 20 Jan 2016, at 15:24, Robert Metzger <rmetz...@apache.org> wrote:

Hi,

can you check the log file of the JobManager you're trying to submit the job to? Maybe there you can find helpful information about why it failed.

On Wed, Jan 20, 2016 at 3:23 PM, Ana M. Martinez <a...@cs.aau.dk> wrote:

Hi all,

I am running some experiments with Flink in an Amazon cluster and every now and then (it seems to appear at random) I get the following IOException:

> org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Could not upload the jar files to the job manager.

Sometimes when it fails, I just try to run it again immediately afterwards and it works fine. Any idea on why that might be happening?

Thanks,
Ana