> While working on improving the TLAB sizing code for ZGC @kstefanj ran into an > issue with the following tests failing: > serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorInterpreterObjectTest.java > serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java > > The reason for seeing the problems now is that with the new sizing code ZGC > used smaller TLABs at first, before resizing them to a proper size (to lower > the waste). In the HeapMonitor tests we don't allocate enough to trigger GCs > that will resize the TLABs so most of the tests will now run with small > TLABs. This should not be a problem, but it turns out the current sampling > code is not working properly when you get a lot of outside TLAB allocations. > You get those when trying to allocate a fairly large object (~1400B) that > won't fit into the TLAB, but there are still quite a bit of room in the TLAB > so we don't want to throw it away and take a new one. > > The problem in the current code is that we keep track of when to sample with > multiple variables and when getting out of TLAB allocations these get out of > sync. > > The proposed patch is the result of a restructuring and fixing of the the > code that me and @kstefanj did to fix this issue. > > The idea is to better account how much we have allocated in three different > buckets: > * Outside of TLAB allocations > * Accounted TLAB allocations > * The last bit of TLAB allocations that hasn't been accounted yet > > And then use the sum of that to compare against a *non-changing* threshold to > see if it is time to take a sample. > > There are a few things to think about when reading this code: > * The thread can allocate and retire multiple TLABs before we cross the > sample threshold. > * The sampling can take multiple samples in a single TLAB > * Any overshooting of the sample threshold triggers only one sample and the > extra bytes are ignored when checking for the next sample. > > There are some restructuring in the patch to confine the ThreadHeapSampler > variables and code. For example: > > 1) Moved accounting variables out of TLAB and into the ThreadHeapSampler > 2) Moved thread allocation accounting and sampling code out of the TLAB > 3) Moved retiring of TLABs out of make_parseable (needed to support (2)) > > Some of that could be extracted into a separate PR if that's deemed > worthwhile. > > Tested with the HeapMonitor tests various TLAB sizes. > > If there are anyone using these APIs it would be nice to get feedback if > these changes work well for you.
Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Re-enable tests after merge - Merge remote-tracking branch 'upstream/master' into 8356372_thread_heap_sampler - 8356372: JVMTI heap sampling not working properly with outside TLAB allocations ------------- Changes: - all: https://git.openjdk.org/jdk/pull/25114/files - new: https://git.openjdk.org/jdk/pull/25114/files/ed0c412d..0485cd2f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=25114&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=25114&range=00-01 Stats: 75811 lines in 2330 files changed: 52604 ins; 13809 del; 9398 mod Patch: https://git.openjdk.org/jdk/pull/25114.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/25114/head:pull/25114 PR: https://git.openjdk.org/jdk/pull/25114