On Tue, 28 Jun 2022 11:43:42 GMT, Kevin Walls <kev...@openjdk.org> wrote:
>> Test has been problemlisted for a long time due to intermittent failures.
>>
>> This is a difficult test as it tries to monitor usage thresholds on Memory
>> Pools which are outside its control.
>> Not just Java heap pools, where the allocation it makes may or may not
>> affect a particular pool, but non-heap pools such as CodeHeap and Metadata,
>> where other activity in the VM can affect their usage and surprise the test.
>>
>> The test iterates JMX memory pools where thresholds are supported, sets a
>> threshold one byte higher than current usage, and makes an allocation. This
>> only makes sense on Java heap pools. It is tempting to skip non-heap pools,
>> but this test can still give a sanity test about threshold behaviour. That
>> is actually its main purpose, as the allocation is unlikely to affect the
>> pool being tested.
>>
>> With the changes here, I'm seeing the test and all its variations pass
>> reliably, i.e. 50 iterations in each tested platform.
>>
>> Skip testing a non-heap memory pool, e.g. CodeHeap, if it is hitting the
>> threshold while we test, because that means it is changing outside our
>> control. Also re-test isExceeded on failure, as fetching the usage and
>> isExceeded is a race.
>>
>> Logging of more pool stats to better understand failures.
>
> Kevin Walls has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Show log output

Thanks Thomas - It's not a great test. 8-)

It is an old test, and has been problemlisted for a long time. That means it isn't run all the time, but it does get run, and can fail, so it causes bug reports and takes up people's time. I went with making it more robust: I used to see false positives frequently, and now I see none. If there are future failures, I would revisit.

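For readers who don't have the test open, the procedure from the description (iterate pools that support thresholds, set a threshold one byte above current usage, allocate, check) might look roughly like the sketch below. This is not the actual test code: the class name ThresholdSketch, the helpers thresholdFor and checkPools, and the 1 MB allocation size are all made up for illustration. It reads the peak usage once into a local so that the value logged is the value that would be compared.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical sketch, not the test from the PR.
final class ThresholdSketch {

    // One byte above current usage, as the PR description states.
    static long thresholdFor(long used) {
        return used + 1;
    }

    // Walks the pools that support usage thresholds; returns how many were examined.
    static int checkPools() {
        int examined = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!pool.isValid() || !pool.isUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage before = pool.getUsage();
            if (before == null) {
                continue; // pool became invalid while we were looking at it
            }
            long threshold = thresholdFor(before.getUsed());
            if (before.getMax() != -1 && threshold > before.getMax()) {
                continue; // setUsageThreshold would reject a value above max
            }
            long saved = pool.getUsageThreshold();
            pool.setUsageThreshold(threshold);
            pool.resetPeakUsage();

            byte[] junk = new byte[1024 * 1024]; // assumed allocation size

            // Read the peak ONCE; log and compare the same snapshot.
            MemoryUsage peak = pool.getPeakUsage();
            boolean exceeded = pool.isUsageThresholdExceeded();
            System.out.println(pool.getName() + " (" + pool.getType() + ")"
                    + " threshold=" + threshold
                    + " peakUsed=" + (peak == null ? -1 : peak.getUsed())
                    + " isExceeded=" + exceeded
                    + " allocated=" + junk.length);

            pool.setUsageThreshold(saved); // restore the previous threshold
            examined++;
        }
        return examined;
    }

    public static void main(String[] args) {
        System.out.println("pools examined: " + checkPools());
    }
}
```

Even with a single snapshot, the peak and isExceeded are still fetched by two separate calls, so a pool that changes between them can disagree with itself; that is the race the description mentions.
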
Yes, using peak usage is good, and the test already did that, but it was confusing: it printed the result of monitor.getPeakUsage(), then made the comparison by calling monitor.getPeakUsage() again. You may not get the same value, so what we log and what we compare aren't the same.

I try to avoid the races in this far from ideal test, and noticing when the pool is changing outside our control avoids the false failures in CodeHeaps which I was seeing frequently.

The small allocation and then checking if thresholds are reached: yes, I think I covered that this is really unlikely to test much. However, I have seen it hit the G1 old gen at the right time, where it observes 0 usage, makes the allocation, and then presumably there's been a GC and that gen has significant usage and the threshold is reached.

    5 pool java.lang:name=G1 Old Gen,type=MemoryPool of type: Heap memory
    supports usage thresholds
    used value is 0 max is 31675383808 isExceeded = false
    threshold set to 1
    threshold count 0
    reset peak usage. peak usage = 0 isExceeded = false
    Allocated heap. isExceeded = true
    used value is 16740352 max is 31675383808 isExceeded = true
    peak used value is 16740352 peak max is 31675383808

I'm not claiming that makes it a great test, but I would like to get it back in circulation, so we can find out if it causes more noise, and can consider whether it is worth keeping or reworking further.

-------------

PR: https://git.openjdk.org/jdk/pull/9309