On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe <stu...@openjdk.org> wrote:
>> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I >> closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated >> too much comment history and got confusing. For a history of this issue, see >> previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can >> recover a significant memory footprint if the VM process suffers from >> high-but-rare malloc spikes. It does not matter who causes the spikes: the >> JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes >> often carry over as permanent RSS increase. Note that C-heap retention is >> difficult to observe. Since it is freed memory, it won't appear in NMT; it >> is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It >> makes a lot of sense with applications that cause high traffic on the >> C-heap. The JVM, however, clusters allocations and often rolls its own >> memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 >> [2], we added a new jcmd command to *manually* trim the C-heap on Linux >> (`jcmd System.trim_native_heap`). We then observed customers running this >> command periodically to slim down process sizes of container-bound jvms. >> That is cumbersome, and the JVM can do this a lot better - among other >> things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and >> experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had >> they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in >> that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the >> heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. >> "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with >> typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional > commit since the last revision: > > last cleanups and shade feedback Another read. I think we would need to tighten up style a bit too: the newline spacing style is different across the change. src/hotspot/share/runtime/globals.hpp line 1990: > 1988: "If TrimNativeHeap is enabled: interval, in ms, at which " > \ > 1989: "the GC will attempt to trim the native heap.") > \ > 1990: range(100, UINT_MAX) > \ Still mentions "GC". Should we accept 1ms as valid interval? 100ms might be too big for short-running workloads. Very aggressive trim at 1ms would also be useful for stress-testing trim code. src/hotspot/share/runtime/trimNative.cpp line 44: > 42: Monitor* const _lock; > 43: bool _stop; > 44: unsigned _suspend_count; `uint16_t`, maybe? src/hotspot/share/runtime/trimNative.cpp line 49: > 47: uint64_t _num_trims_performed; > 48: > 49: bool suspended() const { Style: `is_suspended()`. src/hotspot/share/runtime/trimNative.cpp line 66: > 64: } > 65: > 66: bool stopped() const { `is_stopped()`? src/hotspot/share/runtime/trimNative.cpp line 75: > 73: SafepointSynchronize::is_at_safepoint() || > 74: SafepointSynchronize::is_synchronizing(); > 75: } Suggestion: bool at_or_nearing_safepoint() const { return SafepointSynchronize::is_at_safepoint() || SafepointSynchronize::is_synchronizing(); } src/hotspot/share/runtime/trimNative.cpp line 84: > 82: run_inner(); > 83: log_info(trim)("NativeTrimmer stop."); > 84: } Do we need this logging? We can simplify and just inline `run_inner` here. src/hotspot/share/runtime/trimNative.cpp line 99: > 97: > 98: if (trim_result) { > 99: _num_trims_performed++; Simplification: let's just use `Atomic::*` for `_num_trims_performed`, and this `trim_result` dance (which I think is only needed to get the update under lock?) is not needed. src/hotspot/share/runtime/trimNative.cpp line 149: > 147: return true; > 148: } else { > 149: log_info(trim)("Trim native heap (no details)"); Consistency: `Trim native heap: complete, no details`. src/hotspot/share/runtime/trimNative.cpp line 177: > 175: // No need to wakeup trimmer > 176: } > 177: log_debug(trim)("NativeTrimmer pause (%s) (%u)", reason, n); Suggestion: log_debug(trim)("NativeTrimmer pause for %s (%u suspend requests)", reason, n); src/hotspot/share/runtime/trimNative.cpp line 190: > 188: } > 189: } > 190: log_debug(trim)("NativeTrimmer unpause (%s) (%u)", reason, n); Suggestion: log_debug(trim)("NativeTrimmer unpause for %s (%u suspend requests)", reason, n); src/hotspot/share/runtime/trimNative.hpp line 27: > 25: */ > 26: > 27: #ifndef SHARE_GC_SHARED_TRIMNATIVE_HPP This is now `SHARE_RUNTIME_TRIMNATIVE_HPP`. test/hotspot/jtreg/runtime/os/TestTrimNative.java line 243: > 241: } > 242: > 243: if (args[0].equals("RUN")) { The usual trick is to pull this "internal" main into a class, and then reference it. Like: public Test { public void test() { runWith("...", "Test$Workload"); } public class Workload { public static void main(String... args) {} } } test/hotspot/jtreg/runtime/os/TestTrimNative.java line 247: > 245: System.out.println("Will spike now..."); > 246: for (int i = 0; i < numAllocations; i++) { > 247: ptrs[i] = > Unsafe.getUnsafe().allocateMemory(szAllocations); Pull `Unsafe.getUnsafe()` into a local or a field? test/hotspot/jtreg/runtime/os/TestTrimNative.java line 270: > 268: runTest(Arrays.copyOfRange(args, 1, args.length)); > 269: } else if (args[0].equals("testOffOnNonCompliantPlatforms")) { > 270: testOffOnNonCompliantPlatforms(); Inline these `test*`? This would give some symmetry against `runTest`. ------------- PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1516837594 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254621393 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254632878 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254622666 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254665880 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254633258 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254639542 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254653161 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254657016 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678136 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678539 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254749047 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254746310 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254743705 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254744410