On Tuesday, August 14, 2012 10:45:38 AM UTC+2, Richard Bywater wrote:
>
> Wild guess but are the builds happening on a Windows based slave and
> is someone logging out whilst the builds are running?
>
Thanks for the pointer! But that cannot be it: they are all Linux slaves
running with the SSH Slaves Plugin, and they are dedicated machines, so
nobody is interacting with them.

> I've had problems in the past with this (it's a thing you can get
> around by passing the right argument -- -Xrs I think from memory)
>
> Might be nowhere near the issue but just in case :)
>
> Cheers
> Richard.
>
> On Tue, Aug 14, 2012 at 8:41 PM, Lukas Rytz <lukas...@epfl.ch> wrote:
> > Well, that's unfortunately not the case. I changed our setup to never
> > run builds of the same job on the same machine in parallel, but the
> > aborts still happen. Just less often.
> >
> > The aborts always come in batches. The last batch was 48 aborts at the
> > same time, each producing the same message in the Jenkins log (see
> > first post).
> >
> > I'm mostly wondering whether nobody else has ever experienced this
> > problem.
> >
> > Lukas
> >
> >
> > On Sunday, July 29, 2012 11:55:37 AM UTC+2, Lukas Rytz wrote:
> >>
> >> Further observation: it seems to happen only when running multiple
> >> concurrent builds of the same job on the same slave (but not when
> >> running multiple builds on separate slaves, at least it seems that
> >> way currently).
> >>
> >>
> >> On Saturday, July 28, 2012 3:04:41 PM UTC+2, Lukas Rytz wrote:
> >>>
> >>> Hi all,
> >>>
> >>>
> >>> Lately we see quite a lot of jobs (~10 %) that just abort without any
> >>> intervention. Has anybody else had similar problems?
> >>>
> >>> No error message in the console output:
> >>>
> >>> [...]
> >>> [partest] testing: [...]/run/reflection-constructormirror-nested-good.scala [ OK ]
> >>> [partest] testing: [...]/files/run/viewtest.scala [ OK ]
> >>> [partest] testing: [...]/files/run/reify_newimpl_20.scala [ OK ]
> >>> Build was aborted
> >>> Archiving artifacts
> >>> Checking console output
> >>> Email was triggered for: Aborted
> >>> Sending email for trigger: Aborted
> >>>
> >>> The abort is not because of a timeout (build timeout plugin).
> >>> The Jenkins logs say that the abort is due to an uncaught
> >>> InterruptedException, stack trace below. It always looks the same.
> >>>
> >>> I think the reason is an InterruptedException in master-slave
> >>> communication. The slaves are connected over SSH using the
> >>> "SSH Slaves Plugin".
> >>>
> >>> I don't think that the exception is caused by our testing tool - this
> >>> is running on the client in another (JVM) process, so even if it
> >>> quits with an InterruptedException, that should not abort the
> >>> Jenkins build.
> >>>
> >>>
> >>> Thanks for any pointers!
> >>> Lukas
> >>>
> >>>
> >>>
> >>> Jenkins Log:
> >>>
> >>> INFO: scala-checkin #6609 aborted
> >>> java.lang.InterruptedException
> >>>     at java.lang.Object.wait(Native Method)
> >>>     at hudson.remoting.Request.call(Request.java:146)
> >>>     at hudson.remoting.Channel.call(Channel.java:663)
> >>>     at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:158)
> >>>     at $Proxy36.join(Unknown Source)
> >>>     at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:861)
> >>>     at hudson.Launcher$ProcStarter.join(Launcher.java:345)
> >>>     at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:82)
> >>>     at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
> >>>     at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
> >>>     at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:717)
> >>>     at hudson.model.Build$BuildExecution.build(Build.java:199)
> >>>     at hudson.model.Build$BuildExecution.doRun(Build.java:160)
> >>>     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
> >>>     at hudson.model.Run.execute(Run.java:1488)
> >>>     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
> >>>     at hudson.model.ResourceController.execute(ResourceController.java:88)
> >>>     at hudson.model.Executor.run(Executor.java:236)
> >>>
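
For what it's worth, the top frames of the quoted trace (Object.wait called from hudson.remoting.Request.call) are what a thread shows when it is interrupted while blocked in wait(). In plain Java that only happens when another thread in the same JVM calls interrupt() on it, which is why I don't think a separate test process can be the direct cause. A minimal sketch of just that mechanism follows. It is not Jenkins code, and the class and method names (InterruptDemo, interruptWaitingThread) are made up for illustration:

```java
import java.util.concurrent.CountDownLatch;

public class InterruptDemo {

    /**
     * Blocks a worker thread in Object.wait(), interrupts it from the calling
     * thread, and returns the class name of the exception the worker caught.
     */
    public static String interruptWaitingThread() {
        final Object lock = new Object();
        final CountDownLatch holdingLock = new CountDownLatch(1);
        final String[] seen = { "none" };

        Thread worker = new Thread(() -> {
            synchronized (lock) {
                holdingLock.countDown();      // the worker owns the monitor from here on
                try {
                    while (true) {
                        lock.wait();          // stands in for the wait() inside Request.call
                    }
                } catch (InterruptedException e) {
                    seen[0] = e.getClass().getName();
                }
            }
        });
        worker.start();

        try {
            holdingLock.await();
            // The monitor can only be acquired once the worker has released it,
            // i.e. once the worker is inside wait().
            synchronized (lock) { }
            worker.interrupt();               // same JVM, different thread
            worker.join();                    // also makes seen[0] visible here
        } catch (InterruptedException e) {
            throw new IllegalStateException("calling thread was interrupted", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // prints java.lang.InterruptedException
        System.out.println(interruptWaitingThread());
    }
}
```

So the open question is what calls interrupt() on the executor thread on the master. The sketch does not answer that; it only shows that the interrupt has to originate inside the JVM that logged the trace.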