Hello,
I've run into a concurrency issue where an application freezes due to pinned
virtual threads.
I've described the reproduction steps in detail here:
https://stackoverflow.com/questions/78790376/spring-boot-application-gets-stuck-when-virtual-threads-are-used-on-java-21
The problem shows up when you run a Spring Boot application with virtual
threads enabled. Under the hood my demo application has a Feign client using a
connection pool with up to 20 connections (the threshold cannot be increased
due to a configuration bug). As soon as you make more than 20 simultaneous
requests (even 21), the pool gets exhausted, meaning that subsequent requests
have to wait for a connection to be released, and the application gets stuck
(though it doesn't when platform threads are used, i.e. when the configuration
property spring.threads.virtual.enabled is false).
Running the application with -Djdk.tracePinnedThreads=full, I've identified the
cause more precisely: it is located within
AbstractConnPool.getPoolEntryBlocking().
Here's the link to the pinned threads stack trace:
https://github.com/stsypanov/concurrency-demo/blob/master/pinned-threads.txt
In the file, pay attention to these lines:
line 12: org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:319)
line 92: org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:391)
Now let's examine the source code of
o.a.h.p.AbstractConnPool.getPoolEntryBlocking over here:
https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/pool/AbstractConnPool.java
In this class we have a ReentrantLock (recommended for use with virtual
threads instead of synchronized blocks) and its Condition:
    private final Lock lock;
    private final Condition condition;

    public AbstractConnPool() {
        this.lock = new ReentrantLock();
        this.condition = this.lock.newCondition();
    }
Later, in the method AbstractConnPool.getPoolEntryBlocking(), we have this
logic:
    private E getPoolEntryBlocking() {
        this.lock.lock(); // line 319
        try {
            for (;;) {
                try {
                    if (deadline != null) {
                        success = this.condition.awaitUntil(deadline);
                    } else {
                        this.condition.await(); // line 391
                        success = true;
                    }
                }
            }
        } finally {
            this.lock.unlock();
        }
    }
This code works with platform threads but gets stuck with virtual ones. If you
take a thread dump of the stuck application, there will be 20 workers in the
ForkJoinPool, each with the same stack trace (with different ids, of course):
"ForkJoinPool-1-worker-1" prio=0 tid=0x0 nid=0x0 waiting on condition
java.lang.Thread.State: WAITING
on java.lang.VirtualThread@121c8328 owned by "tomcat-handler-123" Id=214
at java.base/jdk.internal.vm.Continuation.run(Continuation.java:248)
at java.base/java.lang.VirtualThread.runContinuation(VirtualThread.java:245)
at java.base/java.lang.VirtualThread$$Lambda/0x000001579b475d08.run(Unknown Source)
at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(ForkJoinTask.java:1726)
at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(ForkJoinTask.java:1717)
at java.base/java.util.concurrent.ForkJoinTask$InterruptibleTask.exec(ForkJoinTask.java:1641)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1489)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2071)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2033)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
The thread dump above was taken on Java 22, so the issue is still there; it is
also reproducible with other JDK distributions (e.g. Liberica JDK).
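A thread dump is not the only way to spot the parked carriers. As a minimal
sketch, assuming the carrier threads are visible to Thread.getAllStackTraces()
under the "ForkJoinPool-1-worker-" names seen in the dump, something like the
following could list them and their states from inside the application.
CarrierInspector and describe() are hypothetical names made up for
illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the demo application.
public class CarrierInspector {

    // Returns "name: state" for every live thread reported by
    // Thread.getAllStackTraces() whose name starts with the given prefix.
    public static List<String> describe(String namePrefix) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (thread.getName().startsWith(namePrefix)) {
                result.add(thread.getName() + ": " + thread.getState());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumption: carriers are named as in the thread dump above.
        for (String line : describe("ForkJoinPool-1-worker-")) {
            System.out.println(line);
        }
    }
}
```

In the stuck application one would expect 20 entries here, all WAITING, if the
assumption about thread visibility holds.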
I think this is a bug somewhere in the JVM, since the stack trace ends in the
native Continuation.enterSpecial(); otherwise the behavior would be the same
for platform and virtual threads.
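For reference, the waiting pattern in getPoolEntryBlocking() boils down to
roughly the following. This is a simplified sketch, not the HttpCore code:
TinyPool, acquire(), release() and run() are hypothetical names, and the 50 ms
sleep stands in for a request. The driver uses platform threads, where 21
borrowers against 20 slots all complete; on Java 21+ the same run() could be
given Executors.newVirtualThreadPerTaskExecutor() for comparison, though I make
no claim that this sketch alone reproduces the freeze.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a connection pool; not the HttpCore class.
public class TinyPool {
    private final Lock lock = new ReentrantLock();
    private final Condition condition = lock.newCondition();
    private int available;

    public TinyPool(int size) {
        this.available = size;
    }

    // Blocks until a slot is free, like the no-deadline branch of
    // getPoolEntryBlocking().
    public void acquire() throws InterruptedException {
        lock.lock();
        try {
            while (available == 0) {
                condition.await();
            }
            available--;
        } finally {
            lock.unlock();
        }
    }

    // Returns a slot to the pool and wakes up one waiter.
    public void release() {
        lock.lock();
        try {
            available++;
            condition.signal();
        } finally {
            lock.unlock();
        }
    }

    // Runs `tasks` concurrent borrowers against `poolSize` slots and returns
    // how many of them finished within 10 seconds.
    public static int run(ExecutorService executor, int poolSize, int tasks)
            throws InterruptedException {
        TinyPool pool = new TinyPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                try {
                    pool.acquire();
                    try {
                        Thread.sleep(50); // simulated request
                    } finally {
                        pool.release();
                    }
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        done.await(10, TimeUnit.SECONDS);
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            System.out.println("completed: " + run(executor, 20, 21));
        } finally {
            executor.shutdownNow();
        }
    }
}
```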
Regards,
Sergey Tsypanov