I see jobs failing with "FATAL: command execution failed" caused by java.io.EOFException. I understand the connection to the remote node drops, so java.io.EOFException is raised. Is it possible to re-run the job remotely without failing it at all? I've seen the Naginator plugin, but it seems to start a new separate job if the current one fails, which is not what I want. The failure would bubble up to the parent job which started it and cause it to fail.
Or is it possible for the parent job to check the status of the child jobs started by it, and "manually" restart the ones failing due to java.io.EOFException until they succeed? Building remotely on instance-1 (tag1) in workspace /var/lib/jenkins/workspace/eval [vmu-eval-single] $ /bin/sh -xe /tmp/jenkins4616924287086740166.sh + /path/to/evaluation-tool FATAL: command execution failed java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862) at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358) at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49) at hudson.remoting.Command.readFrom(Command.java:140) at hudson.remoting.Command.readFrom(Command.java:126) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63) Caused: java.io.IOException: Unexpected termination of the channel at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77) Caused: java.io.IOException: Backing channel 'instance-1' is disconnected. at hudson.remoting.RemoteInvocationHandler.channelOrFail(RemoteInvocationHandler.java:214) at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:283) at com.sun.proxy.$Proxy78.isAlive(Unknown Source) at hudson.Launcher$RemoteLauncher$ProcImpl.isAlive(Launcher.java:1144) at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1136) at hudson.tasks.CommandInterpreter.join(CommandInterpreter.java:155) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:109) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744) at hudson.model.Build$BuildExecution.build(Build.java:206) at hudson.model.Build$BuildExecution.doRun(Build.java:163) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) at hudson.model.Run.execute(Run.java:1816) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:97) at hudson.model.Executor.run(Executor.java:429) FATAL: Unable to delete script file /tmp/jenkins4616924287086740166.sh java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2681) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:3156) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:862) at java.io.ObjectInputStream.<init>(ObjectInputStream.java:358) at hudson.remoting.ObjectInputStreamEx.<init>(ObjectInputStreamEx.java:49) at hudson.remoting.Command.readFrom(Command.java:140) at hudson.remoting.Command.readFrom(Command.java:126) at hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:36) at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:63) Caused: java.io.IOException: Unexpected termination of the channel at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:77) Caused: hudson.remoting.ChannelClosedException: Channel "unknown": Remote call on instance-eval-worker-25 failed. The channel is closing down or has closed down at hudson.remoting.Channel.call(Channel.java:950) at hudson.FilePath.act(FilePath.java:1069) at hudson.FilePath.act(FilePath.java:1058) at hudson.FilePath.delete(FilePath.java:1539) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:123) at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:66) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20) at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:744) at hudson.model.Build$BuildExecution.build(Build.java:206) at hudson.model.Build$BuildExecution.doRun(Build.java:163) at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:504) at hudson.model.Run.execute(Run.java:1816) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43) at hudson.model.ResourceController.execute(ResourceController.java:97) at hudson.model.Executor.run(Executor.java:429) Build step 'Execute shell' marked build as failure Finished: FAILURE -- You received this message because you are subscribed to the Google Groups "Jenkins Users" group. To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-users+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-users/c4aaddae-b82e-448a-893a-844bc50d515c%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.