I am avail to cut RCs when the fix goes in unless Phil wants to do it. Gary
On Wed, Jul 8, 2020 at 11:20 AM Phil Steitz <phil.ste...@gmail.com> wrote: > > On 7/5/20 7:08 PM, Phil Steitz wrote: > > > > > > On 7/5/20 6:02 PM, Phil Steitz wrote: > >> > >> > >> On 7/5/20 11:07 AM, Phil Steitz wrote: > >>> > >>> The test looks a little off to me. I am not sure I fully understand > >>> what it is trying to do, but I suspect that the reason that it fails > >>> sporadically (I have seen this myself) is that to succeed it needs > >>> to run two evictor cycles when it is set to wait for only one. I > >>> may be wrong as I don't fully follow the intent or logic of the test. > >>> > >>> Looks to me like what it is trying to do is: > >>> > >>> 1. Create and force initialization of a BDS with initialsize = 10, > >>> minIdle = 5, maxIdle = 10, numTestsPerEvictionRun = 3, > >>> timeBetweenEvictionRuns = 1000, minEvictableIdleTime = 100 > >>> 2. Execute a loop whose guard I don't understand that appears to > >>> wait for an eviction cycle to complete, waiting at least 1 > >>> second between iterations > >>> 3. Expect to see no more than 5 idle connections. > >>> > >>> I don't understand the search for eviction threads in the guard in > >>> 2. Just waiting would work if what we are trying to do is to verify > >>> that old connections are being evicted (basically duplicating tests > >>> in [pool]). In any case, in order to get down to 5, you need to run > >>> 2 cycles, because each only visits 3. It would seem to me that to > >>> make it succeed consistently, you need to either increase > >>> numTestsPerEvictionRun to 5 or have it wait for two cycles. > >>> > >>> Phil > >>> > >> > >> I think I now understand the intent of the test and I think that what > >> it is trying to do points to what may be a [pool] bug. The post > >> mentioned in the PR that created this test [1] may be onto something. > >> > >> It seems odd that closing a GKOP with no evictor can stop the evictor > >> for a different GOP, but I think it may be possible. I don't have any > >> more time to dig on this for now, but here is what I think may be > >> happening: > >> > >> As the OP stated in [1], the close on the GKOP for the statement pool > >> calls BaseGOP stopEvictor. This results in > >> EvictionTimer.cancel(evictor, evictorShutdownTimeoutMillis, > >> TimeUnit.MILLISECONDS) being called with a evictor null. This starts > >> with > >> > >> if (evictor != null) { > >> evictor.cancel(); > >> } > >> if (executor != null && executor.getQueue().isEmpty()) { > >> executor.shutdown(); > >> > >> } > >> > >> A quick look at the jdk sources for the executor makes me think it is > >> possible that if there is just one other evictor running and it is in > >> process of running a task when the test above is performed, the queue > >> may show as empty because it has taken the queued task, is executing > >> it and only re-queues afterward. That could cause the main executor > >> to be shut down. > >> > >> To make this happen, it would be best to add some latency to eviction > >> tasks and have another one open and close a bunch of others with > >> evictors disabled concurrently. If this turns out to be correct, a > >> simple fix would be to just exit immediately if evictor is null (a > >> null evictor can't be the last one). > >> > > Thinking about this some more, the quick exit above will handle the > > DBCP statement pool case, but the problem could also happen with > > non-null evictors. So the code probably needs to be modified to live > > up to the commitment in the EvictionTimer class javadoc (that it keeps > > track of its client pools and shuts down only when they have all shut > > down). > > > I have verified that this is in fact a [pool] bug. I have a unit test > that shows eviction stopped for an active pool as a result of closing > another pool while an eviction task is executing. I will open a [pool] > JIRA with the test case later today. I would expect the [dbcp] > statement pool use case to be where this happens most frequently, so it > may make sense to open a [dbcp] ticket as well, linked to the [pool] > issue. Agreed? > > I think that what is happening is what I describe above. > ScheduledThreadPoolExecutor reports an empty queue when it is executing > a task and only re-queues the next one when the executing instance is > finished. So if (executor != null && executor.getQueue().isEmpty()) > returns true when there is an eviction task in progress. The best and > simplest fix that I can think of is to add && executor.getActiveCount() > == 0 to the test above. That makes my test succeed and is almost > complete. The "almost" is due to the fact that the queue empty and > active count tests are not atomic. Unfortunately, STPE does not expose > its internal lock so I can't see any way to test atomically. In theory, > between the two tests, a task could complete and get re-queued, but that > is highly unlikely. Any better ideas on the fix? > > Phil > > > >> Phil > >> > >> Phil > >> > >> > >> [1] > >> > http://mail-archives.apache.org/mod_mbox/commons-user/202004.mbox/%3CCAD1sBYus4-j4i2Fs6dvzv_p4z0kaiu_8Yz8qUwXxTGAKp1ef1g%40mail.gmail.com%3E > >> > >>> On 7/5/20 6:04 AM, Gary Gregory wrote: > >>>> This must be some timing issue as the test passed on GitHub and > Travis, > >>>> but, I do see one failure for that on Java 11 on GitHub. > >>>> > >>>> I am guessing the test might need to be set up differently. > >>>> > >>>> Gary > >>>> > >>>> On Fri, Jul 3, 2020, 14:35 Robert Paschek< > robert.pasc...@comm-unity.at> > >>>> wrote: > >>>> > >>>>> Hello, > >>>>> > >>>>> when I'm running mvn clean test on the current master branch, > testEvict > >>>>> always fails with the same error. > >>>>> > >>>>> [ERROR] Failures: > >>>>> [ERROR] > TestAbandonedBasicDataSource>TestBasicDataSource.testEvict:946 > >>>>> EvictionTimer thread was destroyed with numIdle=7(expected: less or > equal > >>>>> than 5) > >>>>> [ERROR] TestBasicDataSource.testEvict:946 EvictionTimer thread was > >>>>> destroyed with numIdle=7(expected: less or equal than 5) > >>>>> [ERROR] > TestBasicManagedDataSource>TestBasicDataSource.testEvict:946 > >>>>> EvictionTimer thread was destroyed with numIdle=7(expected: less or > equal > >>>>> than 5) > >>>>> [ERROR] > >>>>> TestDataSourceXAConnectionFactory>TestBasicDataSource.testEvict:946 > >>>>> EvictionTimer thread was destroyed with numIdle=7(expected: less or > equal > >>>>> than 5) > >>>>> > >>>>> Is there something wrong with my setup? I'm using Maven 3.6.3 with > >>>>> AdoptOpenJDK 8.0.252. > >>>>> > >>>>> Thanks for help. > >>>>> > >>>>> Regards, > >>>>> Robert > >>>>> > >>>>> --------------------------------------------------------------------- > >>>>> To unsubscribe, e-mail:dev-unsubscr...@commons.apache.org > >>>>> For additional commands, e-mail:dev-h...@commons.apache.org > >>>>> > >>>>> >