On Thu, 5 Sep 2024 13:25:17 GMT, Severin Gehwolf <sgehw...@openjdk.org> wrote:

>> Please review this fix for cgroups-based metrics reporting in the 
>> `jdk.internal.platform` package. This fix is supposed to address wrong 
>> reporting of certain limits if the limits aren't set at the leaf nodes.
>> 
>> For example, on cg v2, the memory limit interface file is `memory.max`. 
>> Consider a cgroup path of  `/a/b/c/d`. The current code only reports the 
>> limits (via Metrics) correctly if it's set at `/a/b/c/d/memory.max`. 
>> However, some systems - like a systemd slice - sets those limits further up 
>> the hierarchy. For example at `/a/b/c/memory.max`. `/a/b/c/d/memory.max` 
>> might be set to the value `max` (for unlimited), yet `/a/b/c/memory.max` 
>> would report the actual limit value (e.g. `1048576000`).
>> 
>> This patch addresses this issue by:
>> 
>> 1. Refactoring the interface lookup code to relevant controllers for 
>> cpu/memory. The CgroupSubsystem classes then delegate to those for the 
>> lookup. This facilitates having an API for the lookup of an updated limit in 
>> step 2.
>> 2. Walking the full hierarchy of the cgroup path (if any), looking for a 
>> lower limit than at the leaf. Note that it's not possible to raise the limit 
>> set at a path closer to the root via the interface file at a 
>> further-to-the-leaf-level. The odd case out seems to be `max` values on some 
>> systems (which seems to be the default value).
>> 
>> As an optimization this hierarchy walk is skipped on containerized systems 
>> (like K8S), where the limits are set in interface files at the leaf nodes of 
>> the hierarchy. Therefore there should be no change on those systems.
>> 
>> This patch depends on the Hotspot change implementing the same for the JVM 
>> so that `Metrics.isContainerized()` works correctly on affected systems 
>> where `-XshowSettings:system` currently reports `System not containerized` 
>> due to the missing JVM fix. A test framework for such hierarchical systems 
>> has been proposed in 
>> [JDK-8333446](https://bugs.openjdk.org/browse/JDK-8333446). This patch adds 
>> a test using that framework among some simpler unit tests.
>> 
>> Thoughts?
>> 
>> Testing:
>> 
>> - [x] GHA
>> - [x] Container tests on Linux x86_64 on cg v1 and cg v2 systems
>> - [x] Some manual testing using systemd slices
>
> Severin Gehwolf has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains eight additional 
> commits since the last revision:
> 
>  - Add JVM_HostActiveProcessorCount using JVM api
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Adjust test and fix code to return lowest hierarchical limit
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - Add root check for SystemdMemoryAwarenessTest
>  - Merge branch 'jdk-8322420_cgroup_hierarchy_walk_init' into 
> jdk-8336881-metrics-systemd-slice
>  - 8336881: [Linux] Support for hierarchical limits for Metrics

Thanks. I forgot to update said tests after the latest update. Fixing it now.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20280#issuecomment-2340949763

Reply via email to