On 9/11/11 1:49 PM, Phil Steitz wrote:
> I don't understand why exactly, but testMaxActivePerKeyExceeded is
> now hanging regularly for me using 1.6.0_26 (Apple Lion). In the
> thread dump, I get the driving test thread:
>
> "main" prio=5 tid=7feb5b001800 nid=0x1057f1000 waiting on condition [1057ee000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>     at java.lang.Thread.sleep(Native Method)
>     at org.apache.commons.pool2.impl.TestGenericKeyedObjectPool.runTestThreads(TestGenericKeyedObjectPool.java:508)
>     at org.apache.commons.pool2.impl.TestGenericKeyedObjectPool.testMaxActivePerKeyExceeded(TestGenericKeyedObjectPool.java:1274)
>
> And then all of the TestThreads waiting like this:
>
> "Thread-86" prio=5 tid=7feb5c95e000 nid=0x10dbdd000 waiting on condition [10dbdc000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <7f3038fd8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>     at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:469)
>     at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:757)
>     at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:717)
>     at org.apache.commons.pool2.impl.TestGenericKeyedObjectPool$TestThread.run(TestGenericKeyedObjectPool.java:1418)
>
> From the line GKOP line number, you can see that what has happened
> is that all of the threads have hit the pool when it was exhausted,
> failed to create new instances and then gone fishing. I am not sure
> exactly how it is happening, but I suspect what is going on is that
> clearOldest is over-zealously killing idle instances before waiting
> threads can get to them.

I think I know what is happening here, and I am working on a test
case to confirm. The sequence appears to be:

0) Client threads check out and return instances under random keys.

1) clearOldest gets invoked by a borrower, which causes some idle
   instances to be destroyed.

2) destroy has measurable latency in the config for the hanging
   test. When an instance is marked to be destroyed, its associated
   capacity is not returned to the pool until destroy completes
   (i.e., no create can be started in its place until the destroy
   completes - see where numTotal.decrementAndGet() appears in
   destroy).

3) Threads that "should" be allowed to create can't when they
   arrive, because destroys in progress have not completed. These
   threads then park and wait forever (or until a new thread arrives
   and creates).

For all of this to work, the thread in 1) has to exit create without
being served, which is possible if the key that it is looking for is
maxed. Then more bad luck has to follow: none of the parked threads
arriving in 3) are going after that key.

This kind of starvation might happen in simpler scenarios as well,
for example if GOP testOnReturn hits validation failures and destroy
has latency, or when invalidateObject is called while all clients
are parked.

It seems we need one of the following: a way to wake up the parked
threads and allow them to do creates, a way to initiate the creates
once the destroys complete, or a loosening of the rule that creates
cannot happen until destroys have completed.

Phil
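
To make steps 1) to 3) concrete, here is a minimal sketch of the suspected accounting problem. It is a toy model, not the pool2 code: DestroyLatencySketch, MAX_TOTAL, Result and run() are hypothetical names, the idle queue is java.util.concurrent.LinkedBlockingDeque rather than the pool's own deque, and a timed pollFirst stands in for the untimed takeFirst so that the demo terminates instead of hanging. The only behavior taken from the description above is that capacity is released at the end of destroy and nobody is woken when that happens.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only; hypothetical names, not the commons-pool implementation.
class DestroyLatencySketch {
    static final int MAX_TOTAL = 1;
    final AtomicInteger numTotal = new AtomicInteger();
    final LinkedBlockingDeque<Object> idle = new LinkedBlockingDeque<>();

    static final class Result {
        Object borrowedDuringDestroy; // what the mid-destroy borrower ended up with
        int numTotalAfterDestroy;     // capacity count once destroy has finished
        Object createdLater;          // what a fresh arrival can create afterwards
    }

    // Returns a new instance, or null if the capacity count says the pool is full.
    Object create() {
        if (numTotal.incrementAndGet() > MAX_TOTAL) {
            numTotal.decrementAndGet();
            return null;
        }
        return new Object();
    }

    // Destroy with latency: capacity is released only after the slow part
    // finishes, and no waiting borrower is signalled.
    void destroy(Object instance, CountDownLatch started, CountDownLatch mayFinish)
            throws InterruptedException {
        idle.remove(instance);
        started.countDown();
        mayFinish.await();            // stands in for a slow factory destroy
        numTotal.decrementAndGet();
    }

    static Result run() {
        try {
            DestroyLatencySketch pool = new DestroyLatencySketch();
            Object only = pool.create();
            pool.idle.addFirst(only);

            CountDownLatch started = new CountDownLatch(1);
            CountDownLatch mayFinish = new CountDownLatch(1);
            Thread destroyer = new Thread(() -> {
                try {
                    pool.destroy(only, started, mayFinish);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            destroyer.start();
            started.await();

            // Borrower arrives mid-destroy: nothing is idle and create is refused.
            Object got = pool.idle.pollFirst();
            if (got == null) {
                got = pool.create();
            }
            // The destroy now completes and frees capacity...
            mayFinish.countDown();
            destroyer.join();
            // ...but the borrower has already decided to wait on the idle queue,
            // which nothing will ever fill. (Timed here; takeFirst would hang.)
            if (got == null) {
                got = pool.idle.pollFirst(200, TimeUnit.MILLISECONDS);
            }

            Result r = new Result();
            r.borrowedDuringDestroy = got;
            r.numTotalAfterDestroy = pool.numTotal.get();
            r.createdLater = pool.create(); // a new arrival is able to create
            return r;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Result r = run();
        System.out.println("borrowed during destroy: " + r.borrowedDuringDestroy);
        System.out.println("numTotal after destroy:  " + r.numTotalAfterDestroy);
        System.out.println("created by late arrival: " + (r.createdLater != null));
    }
}
```

In this model the borrower leaves empty-handed even though capacity is free by the time it gives up, while a thread arriving afterwards can create immediately. That matches the "or until a new thread arrives and creates" case in 3). Any of the three remedies listed above would change the outcome here, for instance signalling waiters (or starting a create) right after the decrement in destroy.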