TestDescription.java still failing [v2]

Kevin Walls Wed, 29 Jun 2022 01:31:21 -0700

On Tue, 28 Jun 2022 11:43:42 GMT, Kevin Walls <kev...@openjdk.org> wrote:


>> Test has been problemlisted for a long time due to intermittent failures.
>> 
>> This is a difficult test as it tries to monitor usage thresholds on Memory 
>> Pools which are outside its control.
>> Not just Java heap pools, where the allocation it makes may or may not 
>> affect a particular pool, but non-heap pools such as CodeHeap and Metadata, 
>> where other activity in the VM can affect their usage and surprise the test.
>> 
>> The test iterates JMX memory pools where thresholds are supported, sets a 
>> threshold one byte higher than current usage, and makes an allocation.  This 
>> only makes sense on Java heap pools.  It is tempting to skip non-heap pools, 
>> but this test can still give a sanity test about threshold behaviour.  That 
>> is actually its main purpose, as the allocation is unlikely to affect the 
>> pool being tested.
>> 
>> With the changes here, I'm seeing the test and all its variations pass 
>> reliably, i.e. 50 iterations on each tested platform.
>> 
>> Skip testing a non-heap memory pool, e.g. CodeHeap, if it is hitting the 
>> threshold while we test, because that means it is changing outside our 
>> control.  Also re-test isExceeded on failure, as fetching the usage and 
>> isExceeded is a race.
>> 
>> Logging of more pool stats to better understand failures.
>
> Kevin Walls has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Show log output


Thanks Thomas -

It's not a great test. 8-)

It is an old test, and has been problemlisted for a long time.  That means it 
isn't run all the time, but it does still get run and can fail, which causes 
bug reports and takes up people's time.

I went with making it more robust: I used to see false failures frequently, 
and now I see none.  If there are future failures, I will revisit it.

Yes, using peak usage is good, and the test already did that.  What was 
confusing is that it printed the result of monitor.getPeakUsage() and then made 
the comparison by calling monitor.getPeakUsage() again.  The two calls may not 
return the same value, so what we log and what we compare aren't necessarily 
the same.
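
As a rough illustration of reading the peak only once, here is a minimal 
sketch.  It assumes the standard java.lang.management.MemoryPoolMXBean API 
rather than the test's own monitor wrapper, and the class and method names 
(PeakUsageSnapshot, peakReached) are made up for this example:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Illustrative only: PeakUsageSnapshot and peakReached are hypothetical names,
// not code from the test.
class PeakUsageSnapshot {

    // Reads the peak once, logs it, and compares that same value.
    static boolean peakReached(MemoryPoolMXBean pool, long threshold) {
        MemoryUsage peak = pool.getPeakUsage(); // null if the pool is not valid
        if (peak == null) {
            return false;
        }
        long peakUsed = peak.getUsed();
        System.out.println("peak used value is " + peakUsed
                + " peak max is " + peak.getMax());
        return peakUsed >= threshold; // the value compared is the value logged
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getName() + ": " + peakReached(pool, 1));
        }
    }
}
```

Holding the MemoryUsage in a local means a failure log shows exactly the 
number that was compared.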

I try to avoid the races in this far-from-ideal test.  Noticing when the pool 
is changing outside our control avoids the false failures in CodeHeap pools, 
which I was seeing frequently.
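
Roughly, the idea looks like the sketch below.  This is not the code in the 
PR: it assumes the plain java.lang.management API, the names 
(ThresholdSanityCheck, check, Result) are hypothetical, and the allocation 
step is left out to keep it short:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative only: all names here are hypothetical, not taken from the test.
class ThresholdSanityCheck {

    enum Result { PASSED, SKIPPED, FAILED }

    static Result check(MemoryPoolMXBean pool) {
        if (!pool.isValid() || !pool.isUsageThresholdSupported()) {
            return Result.SKIPPED;
        }
        MemoryUsage before = pool.getUsage();
        if (before == null) {
            return Result.SKIPPED;
        }
        long threshold = before.getUsed() + 1; // one byte above current usage
        if (before.getMax() != -1 && threshold > before.getMax()) {
            return Result.SKIPPED; // setUsageThreshold would reject this value
        }
        pool.setUsageThreshold(threshold);
        try {
            boolean exceeded = pool.isUsageThresholdExceeded();
            if (exceeded && pool.getType() == MemoryType.NON_HEAP) {
                // Usage moved without any action from us: outside our control.
                return Result.SKIPPED;
            }
            MemoryUsage after = pool.getUsage();
            if (after == null) {
                return Result.SKIPPED;
            }
            boolean expected = after.getUsed() >= threshold;
            if (exceeded != expected) {
                // Usage and isExceeded were read at different times: re-test.
                exceeded = pool.isUsageThresholdExceeded();
            }
            return exceeded == expected ? Result.PASSED : Result.FAILED;
        } finally {
            pool.setUsageThreshold(0); // 0 disables threshold checking again
        }
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getName() + ": " + check(pool));
        }
    }
}
```

A single re-test narrows the race but does not remove it, so a FAILED result 
from a sketch like this would still need the logged pool stats to interpret.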

The small allocation and then checking if thresholds are reached: yes, I think 
I covered that this is really unlikely to test much.  However, I have seen it 
hit the G1 old gen at the right time: it observes 0 usage, makes the 
allocation, and then presumably there has been a GC, that gen has significant 
usage, and the threshold is reached.

        
5 pool java.lang:name=G1 Old Gen,type=MemoryPool of type: Heap memory
  supports usage thresholds
     used value is 0      max is 31675383808 isExceeded = false
  threshold set to 1
  threshold count  0
  reset peak usage. peak usage = 0 isExceeded = false
  Allocated heap. isExceeded = true
     used value is 16740352      max is 31675383808 isExceeded = true
peak used value is 16740352 peak max is 31675383808

I'm not claiming that makes it a great test, but I would like to get it back in 
circulation, so we can find out whether it causes more noise and consider 
whether it is worth keeping or reworking further.

-------------

PR: https://git.openjdk.org/jdk/pull/9309

Re: RFR: 8198668: MemoryPoolMBean/isUsageThresholdExceeded/isexceeded001/TestDescription.java still failing [v2]
