Hi all,

For a while we've seen occasional hangs during the artifact stage of some jobs. Of course they always seem to happen when we're under the gun and need a critical build, as was the case this morning ;-)
Originally I thought that these hangs might be caused by a mismatch between what the job is building and the list of artifacts. For example, I've definitely seen issues where, if I change our build to create a new artifact while jobs are running that don't "know" how to produce it, the artifacting stage hangs. I've always theorized that in this case Jenkins searches our tree for the missing artifact, and is either bogged down by the sheer size of our source tree (which is huge) or confused by our use of symlinks in the tree. But to be honest, I've never tried to do more than guess at the problem.

I *did* make a change like this a few days ago, and did see an artifacting hang. Since this was expected, I terminated that build. After that, new jobs succeeded, until I got a hang out of the blue (with no build job changes) this morning.

Here's the stack trace from the build slave that is hung:

--snip--
"Executor #1 for MacBuildSlave-Speedy : executing ClientMacFullInstaller #700" Id=58 Group=main TIMED_WAITING on [B@5c8c89e2
    at java.lang.Object.wait(Native Method)
    - waiting on [B@5c8c89e2
    at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:173)
    at hudson.util.HeadBufferingStream.read(HeadBufferingStream.java:61)
    at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:221)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
    at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:90)
    at org.apache.tools.tar.TarBuffer.readBlock(TarBuffer.java:257)
    at org.apache.tools.tar.TarBuffer.readRecord(TarBuffer.java:223)
    at hudson.org.apache.tools.tar.TarInputStream.read(TarInputStream.java:345)
    at java.io.FilterInputStream.read(FilterInputStream.java:90)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1025)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:999)
    at hudson.util.IOUtils.copy(IOUtils.java:36)
    at hudson.FilePath.readFromTar(FilePath.java:1759)
    at hudson.FilePath.copyRecursiveTo(FilePath.java:1685)
    at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:116)
    at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
    at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:703)
    at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:678)
    at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:656)
    at hudson.model.Build$RunnerImpl.post2(Build.java:162)
    at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:625)
    at hudson.model.Run.run(Run.java:1435)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:238)
--snip--

Note that we're running Jenkins 1.455.

I really want to get back to a state where we can trust that Jenkins will deliver builds without falling over. Is the above information helpful, or is more needed? I've got a complete threadDump stack trace of all the slaves and threads in case that is helpful, but it seemed like way too much to post here.

Is there anything else I can do to help diagnose this problem while it's happening? Unfortunately I cannot leave the build hung forever. I'll likely have to stop and restart it sometime today, so if anyone has suggestions as to what I can do next to look at the problem while it's still in front of me, please let me know soon.

Best,
--
Allen Cronce