On Mon, 30 Sep 2024 12:03:14 GMT, Severin Gehwolf <sgehw...@openjdk.org> wrote:
>> Please review this fix for cgroups-based metrics reporting in the >> `jdk.internal.platform` package. This fix is supposed to address wrong >> reporting of certain limits if the limits aren't set at the leaf nodes. >> >> For example, on cg v2, the memory limit interface file is `memory.max`. >> Consider a cgroup path of `/a/b/c/d`. The current code only reports the >> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. >> However, some systems - like a systemd slice - sets those limits further up >> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` >> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` >> would report the actual limit value (e.g. `1048576000`). >> >> This patch addresses this issue by: >> >> 1. Refactoring the interface lookup code to relevant controllers for >> cpu/memory. The CgroupSubsystem classes then delegate to those for the >> lookup. This facilitates having an API for the lookup of an updated limit in >> step 2. >> 2. Walking the full hierarchy of the cgroup path (if any), looking for a >> lower limit than at the leaf. Note that it's not possible to raise the limit >> set at a path closer to the root via the interface file at a >> further-to-the-leaf-level. The odd case out seems to be `max` values on some >> systems (which seems to be the default value). >> >> As an optimization this hierarchy walk is skipped on containerized systems >> (like K8S), where the limits are set in interface files at the leaf nodes of >> the hierarchy. Therefore there should be no change on those systems. >> >> This patch depends on the Hotspot change implementing the same for the JVM >> so that `Metrics.isContainerized()` works correctly on affected systems >> where `-XshowSettings:system` currently reports `System not containerized` >> due to the missing JVM fix. A test framework for such hierarchical systems >> has been added in >> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds >> a test using that framework among some simpler unit tests. >> >> Thoughts? >> >> Testing: >> >> - [x] GHA >> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems >> - [x] Some manual testing using systemd slices > > Severin Gehwolf has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 34 commits: > > - Add exclusive access dirs directive for systemd tests > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8336881-metrics-systemd-slice > - Improve systemd slice test for non-root on cg v2 > - Fix unit tests > - Add JVM_HostActiveProcessorCount using JVM api > - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into > jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init > - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into > jdk-8336881-metrics-systemd-slice > - Merge branch 'master' into jdk-8322420_cgroup_hierarchy_walk_init > - ... and 24 more: https://git.openjdk.org/jdk/compare/475b8943...92426dbf Anyone? Happy to answer any questions you might have :) Testing run on cgroups v2 with this: Passed: containers/cgroup/CgroupSubsystemFactory.java Passed: containers/cgroup/TestContainerized.java Passed: containers/docker/DockerBasicTest.java Passed: containers/docker/ShareTmpDir.java Passed: containers/docker/TestContainerInfo.java Passed: containers/docker/TestCPUAwareness.java Passed: containers/docker/TestCPUSets.java Passed: containers/docker/TestJcmd.java Passed: containers/docker/TestJcmdWithSideCar.java Passed: containers/docker/TestJFREvents.java Passed: containers/docker/TestJFRNetworkEvents.java Passed: containers/docker/TestJFRWithJMX.java Passed: containers/docker/TestLimitsUpdating.java Passed: containers/docker/TestMemoryAwareness.java Passed: containers/docker/TestMemoryWithCgroupV1.java Passed: containers/docker/TestMisc.java Passed: containers/docker/TestPids.java Passed: containers/systemd/SystemdMemoryAwarenessTest.java Passed: jdk/internal/platform/cgroup/CgroupSubsystemCpuControllerTest.java Passed: jdk/internal/platform/cgroup/CgroupSubsystemMemoryControllerTest.java Passed: jdk/internal/platform/cgroup/CgroupV1SubsystemControllerTest.java Passed: jdk/internal/platform/cgroup/CgroupV2SubsystemControllerTest.java Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemFactory.java Passed: jdk/internal/platform/cgroup/TestSystemSettings.java Passed: jdk/internal/platform/docker/TestDockerBasic.java Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java Passed: jdk/internal/platform/docker/TestGetFreeSwapSpaceSize.java Passed: jdk/internal/platform/docker/TestLimitsUpdating.java Passed: jdk/internal/platform/docker/TestPidsLimit.java Passed: jdk/internal/platform/docker/TestSystemMetrics.java Passed: jdk/internal/platform/docker/TestUseContainerSupport.java Passed: jdk/internal/platform/systemd/SystemdMemoryAwarenessTest.java Test results: passed: 35 ------------- PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2382997765