I think I may have figured this out. I reopened and added a comment to POOL-411.

Phil

On Thu, Jul 6, 2023 at 2:41 PM Phil Steitz <phil.ste...@gmail.com> wrote:

> <snip>
>
> I guess it's good news that CI hit the error below when reviewing the PR
> that I had prepared for the POOL-391 fixes. I only saw it once in many
> test runs and only on OpenJDK 20.0.1. Looks like CI is running 17 on
> azure-linux. I am pretty sure it has nothing to do with the changes in the
> PR, partly because I saw it once running the first RC code.
>
> This is very strange and troubling. More eyeballs most welcome. Here is
> the problem:
>
> The NPE below happens inside GKOP addIdleObject, which is called by
> addObject. The code for addObject follows the standard pattern:
>
>     register(key);
>     try {
>         addIdleObject(key, create(key));
>     } finally {
>         deregister(key);
>     }
>
> So addIdleObject is called while the owning thread has the key
> "registered." To register a key is basically saying you are interested
> in / about to do something to its associated pool. There is a
> numInterested counter (attached to the keyed pool) that gets incremented
> when a thread registers a key and decremented when the key is
> deregistered. Registration creates a pool and initializes its counter if
> there is no pool under the given key. When a key is deregistered, the
> counter is checked and if the pool has no instances under management and
> the counter is zero, the pool is removed. In this case, the registration
> in addObject should prevent the pool being removed before or during
> execution of addIdleObject, but the NPE means it has been removed. Somehow
> numInterested is getting corrupted.
>
> I will keep trying to get this to happen and see if I can find scenarios
> where deregister is somehow called twice for one register. Any suggestions
> or success getting this to happen with the code now in master most
> appreciated.
>
> Phil
>
>> 2. On MacOS 13.4.1, OpenJDK 20.0.1, I got the following test failure just
>> one time and can't reproduce:
>>
>> java.util.concurrent.ExecutionException: java.lang.NullPointerException: Cannot invoke "org.apache.commons.pool2.impl.GenericKeyedObjectPool$ObjectDeque.getIdleObjects()" because the return value of "java.util.Map.get(Object)" is null
>>     at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>     at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
>>     at org.apache.commons.pool2.impl.TestGenericKeyedObjectPool.lambda$testConcurrentBorrowAndClear$2(TestGenericKeyedObjectPool.java:1056)
>>     ... 71 more
>> Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.commons.pool2.impl.GenericKeyedObjectPool$ObjectDeque.getIdleObjects()" because the return value of "java.util.Map.get(Object)" is null
>>     at org.apache.commons.pool2.impl.GenericKeyedObjectPool.addIdleObject(GenericKeyedObjectPool.java:307)
>>     at org.apache.commons.pool2.impl.GenericKeyedObjectPool.addObject(GenericKeyedObjectPool.java:332)
>>     at org.apache.commons.pool2.KeyedObjectPool.addObjects(KeyedObjectPool.java:136)
>>     at org.apache.commons.pool2.KeyedObjectPool.addObjects(KeyedObjectPool.java:113)
>>     at org.apache.commons.pool2.impl.TestGenericKeyedObjectPool.lambda$testConcurrentBorrowAndClear$0(TestGenericKeyedObjectPool.java:1036)
>>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:577)
>>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
>>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
>>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
>>     at java.base/java.lang.Thread.run(Thread.java:1623)
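
For anyone following along without the source open, the register / deregister bookkeeping described in the quoted message can be sketched roughly as below. This is a simplified model, not the actual GenericKeyedObjectPool code: the class InterestTrackingPools, the SubPool type, the numManaged field and the hasPool helper are all made up for illustration, and the use of ConcurrentHashMap.compute to make each step atomic is an assumption of the sketch, not a statement about how the library synchronizes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the bookkeeping described in the thread.
// It is NOT the real GenericKeyedObjectPool implementation.
class InterestTrackingPools<K> {

    // Stand-in for the per-key pool and its numInterested counter.
    static final class SubPool {
        final AtomicLong numInterested = new AtomicLong();
        final AtomicLong numManaged = new AtomicLong(); // instances under management
    }

    private final Map<K, SubPool> pools = new ConcurrentHashMap<>();

    // Register interest in a key, creating the per-key pool if it is absent.
    SubPool register(K key) {
        return pools.compute(key, (k, existing) -> {
            SubPool pool = (existing == null) ? new SubPool() : existing;
            pool.numInterested.incrementAndGet();
            return pool;
        });
    }

    // Drop interest; remove the per-key pool once nobody is interested
    // and it manages no instances (returning null removes the mapping).
    void deregister(K key) {
        pools.computeIfPresent(key, (k, pool) -> {
            long interested = pool.numInterested.decrementAndGet();
            boolean removable = interested == 0 && pool.numManaged.get() == 0;
            return removable ? null : pool;
        });
    }

    // Same shape as the addObject pattern quoted in the thread.
    void addObject(K key) {
        register(key);
        try {
            // While the registration is held, the lookup is expected to be non-null.
            SubPool pool = pools.get(key);
            pool.numManaged.incrementAndGet(); // stands in for addIdleObject(key, create(key))
        } finally {
            deregister(key);
        }
    }

    boolean hasPool(K key) {
        return pools.containsKey(key);
    }
}
```

In this model, two register calls followed by two deregister calls from the same caller leave no pool behind, even if the other caller still believes it holds a registration. That is the kind of unbalanced register / deregister sequence the quoted message proposes looking for; the sketch only illustrates the mechanism and says nothing about whether that is what actually happens in the failing test.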