On Mon, 13 Nov 2023 10:11:26 GMT, Long Yang <d...@openjdk.org> wrote:

>> I would like to fix this.
>>
>> Create 4096 threads, and the stack depth of each thread is 256.
>> After running jmx.dumpAllThreads(true, true), the RSS reaches 5.3GiB.
>> After optimization, the RSS is 250MiB.
>>
>> I would appreciate it if anyone could review this.
>>
>> ---------
>> update
>>
>> If the number of `threads` and the `stack depth` are relatively large, more
>> space must be allocated in `ResourceArea` during the execution of
>> `jmx.dumpAllThreads(true, true)`.
>>
>> The reason is that `VM_ThreadDump::doit` creates a `vframe` for each `frame`
>> of each `thread`.
>> https://github.com/openjdk/jdk/blob/fe0ccdf5f8a5559a608d2e2cd2b6aecbe245c5ec/src/hotspot/share/services/threadService.cpp#L704
>> The size of `vframe` is 4808 bytes, and the size of `compiledVFrame` is 4824
>> bytes, mainly because the `xmm registers` in `RegisterMap` are relatively
>> large. Assuming there are 4096 `threads` and each `thread` has 256 `frames`,
>> the memory required is 4096 * 256 * 4824 bytes = 4.7GiB.
>>
>> The memory for all threads is released only once, by the initial
>> `ResourceMark` of `VM_ThreadDump::doit`.
>> https://github.com/openjdk/jdk/blob/fe0ccdf5f8a5559a608d2e2cd2b6aecbe245c5ec/src/hotspot/share/runtime/vmOperations.cpp#L269
>> My solution is to add a `ResourceMark` for each thread.
>
> Long Yang has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Use VMThread::vm_thread() to avoid the need to call Thread::current()

Hi,

I ran `tier1`, `tier2`, `tier3`, and `tier4` on my host machine.
`tier1`, `tier2`, and `tier3` all passed.
Because my host does not have a display device, I added `export JTREG_KEYWORDS="!headful"` before running `tier4`.
In the end, some `tier4` tests that depend on a printing device failed; the rest passed.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16598#issuecomment-1813661550