On Thu, 12 Dec 2024 01:11:33 GMT, Sergey Chernyshev <schernys...@openjdk.org> wrote:
>> Cgroup V1 subsustem fails to initialize mounted controllers properly in >> certain cases, that may lead to controllers left undetected/inactive. We >> observed the behavior in CloudFoundry deployments, it affects also host >> systems. >> >> The relevant /proc/self/mountinfo line is >> >> >> 2207 2196 0:43 >> /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c >> /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - >> cgroup cgroup rw,cpu,cpuacct >> >> >> /proc/self/cgroup: >> >> >> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c >> >> >> Here, Java runs inside containerized process that is being moved cgroups due >> to load balancing. >> >> Let's examine the condition at line 64 here >> https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72 >> It is always FALSE and the branch is never taken. The issue was spotted >> earlier by @jerboaa in >> [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). >> >> The original logic was intended to find the common prefix of `_root`and >> `cgroup_path` and concatenate the remaining suffix to the `_mount_point` >> (lines 67-68). That could lead to the following results: >> >> Example input >> >> _root = "/a" >> cgroup_path = "/a/b" >> _mount_point = "/sys/fs/cgroup/cpu,cpuacct" >> >> >> result _path >> >> "/sys/fs/cgroup/cpu,cpuacct/b" >> >> >> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page >> (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control >> groups states: >> >> >> ... >> /proc/pid/cgroup (since Linux 2.6.24) >> This file describes control groups to which the process >> with the corresponding PID belongs. The displayed >> information differs for cgroups version 1 and version 2 >> hierarchies. >> For each cgroup hierarchy of which the process is a >> member, there is one entry containing three colon- >> separated fields: >> >> hierarchy-ID:controller-list:cgroup-path >> >> For example: >> >> 5:cpuacct,cpu,cpuset:/daemons >> ... >> [3] This field contains the pathname of the control group >> in the hierarchy to which the process belongs. This >> pathname is relative to the mount point of the >> hierarchy. >> >> >> This explicitly states the "pathname is relative t... > > Sergey Chernyshev has updated the pull request incrementally with one > additional commit since the last revision: > > added details in the log message `CgroupV1Controller::set_subsystem_path` needs high level comment update to describe the logic happening. Testing: >>And after the patch this would become this, right? >>``` >>/sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c >>/sys/fs/cgroup/cpu,cpuacct/ >>``` > >It depends on whether it was a subgroup in the initial path. If >bad/2f57368b-0eda-4e52-64d8-af5c is the subgroup, the reduction will be > >``` >/sys/fs/cgroup/cpu,cpuacct/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c >/sys/fs/cgroup/cpu,cpuacct/bad/2f57368b-0eda-4e52-64d8-af5c >/sys/fs/cgroup/cpu,cpuacct/bad >/sys/fs/cgroup/cpu,cpuacct/ >``` The above case, doesn't seem to be reflected by any gtest test case (or others), please add those. src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 75: > 73: break; > 74: } > 75: suffix = strchr(suffix+1, '/'); Style: Suggestion: suffix = strchr(suffix + 1, '/'); src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java line 63: > 61: Path cgp = Path.of(cgroupPath); > 62: int nameCount = cgp.getNameCount(); > 63: for (int i=0; i<nameCount; ++i) { Style: Suggestion: for (int i=0; i < nameCount; i++) { src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java line 73: > 71: break; > 72: } > 73: cgp = (cgp.getNameCount() > 1) ? > cgp.subpath(1, cgp.getNameCount()) : Path.of(""); Suggestion: cgp = (nameCount > 1) ? cgp.subpath(1, nameCount) : Path.of(""); test/hotspot/jtreg/containers/docker/TestMemoryWithSubgroups.java line 60: > 58: > 59: Common.prepareWhiteBox(); > 60: DockerTestUtils.buildJdkContainerImage(imageName); This can be moved out of the condition. It's there in both cases. test/hotspot/jtreg/containers/docker/TestMemoryWithSubgroups.java line 85: > 83: } else { > 84: System.out.println("Metrics are from neither cgroup v1 nor > v2, skipped for now."); > 85: } Suggestion: throw new RuntimeException("cgroup version not known? was: " + metrics.getProvider()); test/hotspot/jtreg/containers/docker/TestMemoryWithSubgroups.java line 98: > 96: opts.addDockerOpts("--privileged") > 97: .addDockerOpts("--cgroupns=" + (privateNamespace ? "private" > : "host")) > 98: .addDockerOpts("--memory", "1g"); Please extract this "larger" memory limit out of this function. The expectation is to have: Container memory limit => X Some lower limit in a moved path => Y Expect that the lower limit of Y is being detected. For example: testMemoryLimitSubgroupV1("1g", "100m", "104857600", false); test/hotspot/jtreg/containers/docker/TestMemoryWithSubgroups.java line 118: > 116: opts.addDockerOpts("--privileged") > 117: .addDockerOpts("--cgroupns=" + (privateNamespace ? "private" > : "host")) > 118: .addDockerOpts("--memory", "1g"); Same here with the `1g` container memory limit. Move it to a parameter. test/jdk/jdk/internal/platform/docker/TestDockerMemoryMetricsSubgroup.java line 72: > 70: } else { > 71: System.out.println("Metrics are from neither cgroup v1 nor > v2, skipped for now."); > 72: } Please `throw` here. ------------- Changes requested by sgehwolf (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/21808#pullrequestreview-2620718790 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958032435 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958043532 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958048577 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958073462 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958075672 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958059439 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958061783 PR Review Comment: https://git.openjdk.org/jdk/pull/21808#discussion_r1958077572