Hi,

My team and I have been struggling with a hang on our Windows 10 slaves 
that is caused by a bad interaction between JNA module loads.  It looks 
like the SwapSpaceMonitor tries to load one module at the same time that 
the Kernel32Utils.getWin32FileAttributes() call tries to load another JNA 
module, and the slave hangs.
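
For anyone trying to picture the failure mode, here is a minimal sketch of 
what I suspect is going on, assuming the hang is a class-initialization 
cycle between two threads.  A and B are hypothetical stand-ins, not the real 
JNA classes, and the latch exists only to force the unlucky timing.  Threads 
stuck like this may still be reported as RUNNABLE on some JVMs, which would 
fit the dumps below.

```java
import java.util.concurrent.CountDownLatch;

class ClassInitDeadlockSketch {
    // Opens once both static initializers have started, forcing the bad interleaving.
    static final CountDownLatch bothStarted = new CountDownLatch(2);
    // Gives the field reads below somewhere to go.
    static volatile int sink;

    // Hypothetical class whose static initializer needs B to be initialized.
    static class A {
        static final int VALUE;
        static {
            rendezvous();
            VALUE = B.VALUE + 1; // blocks while another thread is initializing B
        }
    }

    // Hypothetical class whose static initializer needs A to be initialized.
    static class B {
        static final int VALUE;
        static {
            rendezvous();
            VALUE = A.VALUE + 1; // blocks while another thread is initializing A
        }
    }

    // Wait until both initializers are in progress on their own threads.
    static void rendezvous() {
        bothStarted.countDown();
        try {
            bothStarted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if both threads are still stuck after the timeout.
    static boolean hangs(long timeoutMillis) {
        Thread t1 = new Thread(() -> sink = A.VALUE, "init-A");
        Thread t2 = new Thread(() -> sink = B.VALUE, "init-B");
        // Daemon threads so the stuck threads do not keep the JVM alive.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(timeoutMillis);
            t2.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return t1.isAlive() && t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads stuck: " + hangs(1000));
    }
}
```

Neither thread holds a java.util.concurrent lock or a monitor on an ordinary 
object here; each one simply waits for the other to finish initializing a 
class, so nothing obvious shows up as a lock owner.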

What's really frustrating about this is that it causes all builds to hang, 
not just the single build on the slave.  This deadlock situation seems to 
cause the Jenkins master to just stop handling any other Pipeline builds.

Below are the 2 thread stacks that best illustrate the problem.  And here 
is the Jira report where I have been keeping full thread dumps and my 
investigation notes:
https://issues.jenkins-ci.org/browse/JENKINS-39179

We've been able to recreate this on all recent versions of Jenkins (2.26, 
2.19.1 and 2.19.2 LTS editions) and even tried swapping Java 7/8 on the 
endpoint to get it to go away.  It just keeps happening, and when it does 
-- all builds stop until I kill the slave with the deadlock.  This is 
happening basically 1 to 2 times a day, and when it happens, I get 30 
people screaming at me.  It was my idea to move us over to Jenkins 2.X and 
Pipeline style builds, so I'm getting a lot of heat about this.

What also amazes me is:  how is this only happening to us?  It seems 
like the kind of lockup that would be hitting lots of people -- we're 
not doing anything that strange or unique.

I've found a few bugs with references to similar thread stacks, but most 
are from about 1 year ago.  So why is this such a big problem now?

Anyway, I'm just putting this out there because I'm hoping to find someone 
else that this is happening to, so we can compare notes.  

Cheers,
Greg



"pool-1-thread-4 for channel" Id=16 Group=main RUNNABLE
    at com.sun.jna.Native.initIDs(Native Method)
    at com.sun.jna.Native.<clinit>(Native.java:148)
    at hudson.util.jna.Kernel32Utils.load(Kernel32Utils.java:112)
    at hudson.util.jna.Kernel32.<clinit>(Kernel32.java:37)
    at hudson.util.jna.Kernel32Utils.getWin32FileAttributes(Kernel32Utils.java:77)
    at hudson.util.jna.Kernel32Utils.isJunctionOrSymlink(Kernel32Utils.java:98)
    at hudson.Util.isSymlink(Util.java:510)
    at hudson.FilePath.deleteRecursive(FilePath.java:1221)
    at hudson.FilePath.access$1000(FilePath.java:195)
    at hudson.FilePath$14.invoke(FilePath.java:1201)
    at hudson.FilePath$14.invoke(FilePath.java:1198)
    at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2772)
    at hudson.remoting.UserRequest.perform(UserRequest.java:153)
    at hudson.remoting.UserRequest.perform(UserRequest.java:50)
    at hudson.remoting.Request$2.run(Request.java:332)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    Number of locked synchronizers = 1
    - java.util.concurrent.ThreadPoolExecutor$Worker@11c41d99

"pool-1-thread-6 for channel" Id=26 Group=main RUNNABLE
    at com.sun.jna.Pointer.<clinit>(Pointer.java:41)
    at com.sun.jna.Structure.<clinit>(Structure.java:2078)
    at org.jvnet.hudson.Windows.monitor(Windows.java:42)
    at hudson.node_monitors.SwapSpaceMonitor$MonitorTask.call(SwapSpaceMonitor.java:124)
    at hudson.node_monitors.SwapSpaceMonitor$MonitorTask.call(SwapSpaceMonitor.java:114)
    at hudson.remoting.UserRequest.perform(UserRequest.java:153)
    at hudson.remoting.UserRequest.perform(UserRequest.java:50)
    at hudson.remoting.Request$2.run(Request.java:332)
    at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

    Number of locked synchronizers = 1
    - java.util.concurrent.ThreadPoolExecutor$Worker@17011895
