On 7/5/20 7:08 PM, Phil Steitz wrote:


On 7/5/20 6:02 PM, Phil Steitz wrote:


On 7/5/20 11:07 AM, Phil Steitz wrote:

The test looks a little off to me.  I am not sure I fully understand what it is trying to do, but I suspect that the reason that it fails sporadically (I have seen this myself) is that to succeed it needs to run two evictor cycles when it is set to wait for only one.  I may be wrong as I don't fully follow the intent or logic of the test.

Looks to me like what it is trying to do is:

 1. Create and force initialization of a BDS with initialSize = 10,
    minIdle = 5, maxIdle = 10, numTestsPerEvictionRun = 3,
    timeBetweenEvictionRuns = 1000, minEvictableIdleTime = 100
 2. Execute a loop (whose guard I don't understand) that appears to
    wait for an eviction cycle to complete, waiting at least 1
    second between iterations
 3. Expect to see no more than 5 idle connections.

I don't understand the search for eviction threads in the guard in step 2.  Just waiting would work if the goal is to verify that old connections are being evicted (basically duplicating tests in [pool]).  In any case, in order to get down to 5 idle connections, you need to run 2 eviction cycles, because each only visits 3.  It seems to me that to make the test succeed consistently, you need to either increase numTestsPerEvictionRun to 5 or have it wait for two cycles.
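For concreteness, the setup described in (1) corresponds roughly to the following.  This is a sketch only; the driver and URL are placeholders and the actual test code may differ in detail.

// Sketch of the configuration described above, inside a test method that
// declares throws SQLException.  Driver/url values are placeholders.
BasicDataSource ds = new BasicDataSource();
ds.setDriverClassName("org.apache.commons.dbcp2.TesterDriver"); // placeholder
ds.setUrl("jdbc:apache:commons:testdriver");                    // placeholder
ds.setInitialSize(10);
ds.setMinIdle(5);
ds.setMaxIdle(10);
ds.setNumTestsPerEvictionRun(3);
ds.setTimeBetweenEvictionRunsMillis(1000);
ds.setMinEvictableIdleTimeMillis(100);
ds.getConnection().close(); // force initialization, creating 10 idle connections
// With 10 idle connections and only 3 examined per run, a single eviction
// run leaves at least 7 idle; it takes two runs (or numTestsPerEvictionRun
// >= 5) to get down to the expected 5.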

Phil


I think I now understand the intent of the test, and what it is trying to do points to what may be a [pool] bug.  The post mentioned in the PR that created this test [1] may be onto something.

It seems odd that closing a GKOP (GenericKeyedObjectPool) with no evictor can stop the evictor for a different GOP (GenericObjectPool), but I think it may be possible. I don't have any more time to dig into this for now, but here is what I think may be happening:

As the OP stated in [1], the close on the GKOP for the statement pool calls BaseGenericObjectPool#stopEvictor.  This results in EvictionTimer.cancel(evictor, evictorShutdownTimeoutMillis, TimeUnit.MILLISECONDS) being called with a null evictor.  That method starts with:

if (evictor != null) {
    evictor.cancel();
}
if (executor != null && executor.getQueue().isEmpty()) {
    executor.shutdown();
}

A quick look at the JDK sources for the executor makes me think it is possible that, if there is just one other evictor running and it is in the process of executing a task when the test above is performed, the queue may show as empty: the executor has taken the queued task, is executing it, and only re-queues it afterward.  That could cause the main executor to be shut down.
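A standalone illustration of that ScheduledThreadPoolExecutor behavior (not DBCP/pool code, just a sketch showing that getQueue().isEmpty() can be true while a periodic task is running):

import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: while a periodic task is executing it is not in the executor's
// queue; the next execution is only re-queued after the current one completes.
public class QueueEmptyDemo {
    public static void main(String[] args) throws InterruptedException {
        ScheduledThreadPoolExecutor stpe = new ScheduledThreadPoolExecutor(1);
        stpe.scheduleWithFixedDelay(() -> {
            try {
                Thread.sleep(500); // simulate a slow eviction run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 0, 1000, TimeUnit.MILLISECONDS);
        Thread.sleep(100); // give the task time to start
        System.out.println("queue empty while task runs: "
                + stpe.getQueue().isEmpty());                        // expected: true
        System.out.println("active count: " + stpe.getActiveCount()); // expected: 1
        stpe.shutdownNow();
    }
}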

To make this happen, it would be best to add some latency to eviction tasks and have another thread concurrently open and close a bunch of other pools with their evictors disabled.  If this turns out to be correct, a simple fix would be to just exit immediately if evictor is null (a null evictor can't be the last one).
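That quick exit would be something like the following (a sketch only, against the snippet quoted above):

// Sketch of the quick exit: a null evictor never scheduled anything on the
// shared executor, so it cannot be the last client and there is nothing to
// shut down.
if (evictor == null) {
    return;
}
evictor.cancel();
if (executor != null && executor.getQueue().isEmpty()) {
    executor.shutdown();
}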

Thinking about this some more, the quick exit above will handle the DBCP statement pool case, but the problem could also happen with non-null evictors.  So the code probably needs to be modified to live up to the commitment in the EvictionTimer class javadoc (that it keeps track of its client pools and shuts down only when they have all shut down).
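One way to honor that commitment would be simple reference counting in EvictionTimer, shutting the shared executor down only when the last client pool has cancelled its evictor.  This is an illustrative sketch only; the names and simplified schedule/cancel shapes (usageCount, etc.) are invented and not the actual EvictionTimer API:

import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of reference counting, not the current EvictionTimer
// code.  The executor is created lazily for the first client and shut down
// only when the last client cancels, regardless of what the queue shows.
final class EvictionTimerSketch {

    private static ScheduledThreadPoolExecutor executor;
    private static int usageCount; // pools with a scheduled evictor

    static synchronized void schedule(final Runnable evictorTask,
            final long delayMillis, final long periodMillis) {
        if (executor == null) {
            executor = new ScheduledThreadPoolExecutor(1);
        }
        executor.scheduleWithFixedDelay(evictorTask, delayMillis,
                periodMillis, TimeUnit.MILLISECONDS);
        usageCount++;
    }

    static synchronized void cancel(final Runnable evictorTask) {
        if (evictorTask != null) {
            // cancelling the individual scheduled task is elided in this sketch
            usageCount--;
        }
        if (usageCount == 0 && executor != null) {
            executor.shutdown();
            executor = null;
        }
    }
}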


I have verified that this is in fact a [pool] bug.  I have a unit test that shows eviction stopped for an active pool as a result of closing another pool while an eviction task is executing.  I will open a [pool] JIRA with the test case later today.  I would expect the [dbcp] statement pool use case to be where this happens most frequently, so it may make sense to open a [dbcp] ticket as well, linked to the [pool] issue.  Agreed?

I think that what is happening is what I describe above.  ScheduledThreadPoolExecutor reports an empty queue while it is executing a periodic task, because the next execution is only re-queued when the current one finishes.  So if (executor != null && executor.getQueue().isEmpty()) evaluates to true when an eviction task is in progress.  The best and simplest fix that I can think of is to add && executor.getActiveCount() == 0 to the test above.  That makes my test succeed and is almost complete.  The "almost" is because the queue-empty and active-count checks are not atomic.  Unfortunately, STPE does not expose its internal lock, so I can't see any way to test both atomically.  In theory, between the two checks, a task could complete and get re-queued, but that is highly unlikely.  Any better ideas on the fix?
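For concreteness, the amended check would look something like this (sketch against the snippet quoted above):

// Sketch of the amended shutdown check.  getActiveCount() covers the window
// in which a periodic eviction task has been taken off the queue for
// execution but not yet re-queued.  As noted above, the two checks are not
// atomic, but the window for that race is tiny.
if (executor != null
        && executor.getQueue().isEmpty()
        && executor.getActiveCount() == 0) {
    executor.shutdown();
}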

Phil


Phil

Phil


[1] http://mail-archives.apache.org/mod_mbox/commons-user/202004.mbox/%3CCAD1sBYus4-j4i2Fs6dvzv_p4z0kaiu_8Yz8qUwXxTGAKp1ef1g%40mail.gmail.com%3E

On 7/5/20 6:04 AM, Gary Gregory wrote:
This must be some timing issue, as the test passed on GitHub and Travis,
but I do see one failure for that on Java 11 on GitHub.

I am guessing the test might need to be set up differently.

Gary

On Fri, Jul 3, 2020, 14:35 Robert Paschek <robert.pasc...@comm-unity.at>
wrote:

Hello,

When I'm running mvn clean test on the current master branch, testEvict
always fails with the same error.

[ERROR] Failures:
[ERROR]   TestAbandonedBasicDataSource>TestBasicDataSource.testEvict:946 EvictionTimer thread was destroyed with numIdle=7(expected: less or equal than 5)
[ERROR]   TestBasicDataSource.testEvict:946 EvictionTimer thread was destroyed with numIdle=7(expected: less or equal than 5)
[ERROR]   TestBasicManagedDataSource>TestBasicDataSource.testEvict:946 EvictionTimer thread was destroyed with numIdle=7(expected: less or equal than 5)
[ERROR]   TestDataSourceXAConnectionFactory>TestBasicDataSource.testEvict:946 EvictionTimer thread was destroyed with numIdle=7(expected: less or equal than 5)

Is there something wrong with my setup? I'm using Maven 3.6.3 with
AdoptOpenJDK 8.0.252.

Thanks for your help.

Regards,
Robert

