On Tue, 12 Jul 2022 22:39:36 GMT, Ioi Lam <ik...@openjdk.org> wrote:

>> Some Kubernetes setups share the /tmp directory across multiple containers. 
>> On rare occasions, the JVM may crash when it tries to write to 
>> `/tmp/hsperfdata_<user>/<pid>` when a process in a separate container 
>> decides to do the same thing (because they happen to have the same 
>> namespaced pid).
>> 
>> This patch avoids the crash by using `flock()` to allow only one of these 
>> processes to write to the file. All other competing processes that fail to 
>> grab the lock will give up the file and run with PerfMemory disabled. We 
>> will try to enable PerfMemory for the failed processes in a follow-up RFE: 
>> [JDK-8289883](https://bugs.openjdk.org/browse/JDK-8289883)
>> 
>> Thanks to Vitaly Davidovich and Nico Williams for coming up with the idea of 
>> using `flock()`.
>> 
>> I kept the use of `kill()` for stale file detection to be compatible with 
>> older JVMs.
>> 
>> I also took the opportunity to clean up the comments and remove dead code. 
>> The old code was using "shared memory resources" which sounds unclear and 
>> odd. I changed the terminology to say "shared memory file" instead.
>
> Ioi Lam has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - add errno to log
>  - added debug log and tweaked comment

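LGTM. My manual tests of this patch work as expected as well. In the two runs below, both containers share the same /tmp: the first opens the file successfully, and the second finds it locked and logs a warning instead of crashing.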


$ podman run --rm -ti --userns=keep-id -u $(id -u) \
    -v $(pwd)/shared-tmp:/tmp:z \
    -v /disk/openjdk/upstream-sources/git/jdk-jdk/build/linux-x86_64-server-release/images/jdk:/opt/jdk:z \
    -v $(pwd)/test:/opt/test:z \
    fedora:36 /opt/jdk/bin/java -Xlog:perf+memops=debug -cp /opt/test HelloWait
[0.001s][debug][perf,memops] PerfDataMemorySize = 32768, os::vm_allocation_granularity = 4096, adjusted size = 32768
[0.001s][info ][perf,memops] Trying to open /tmp/hsperfdata_sgehwolf/1
[0.001s][info ][perf,memops] Successfully opened
[0.001s][debug][perf,memops] PerfMemory created: address = 0x00007fac290dd000, size = 32768
Hello!
$ podman run --rm -ti --userns=keep-id -u $(id -u) \
    -v $(pwd)/shared-tmp:/tmp:z \
    -v /disk/openjdk/upstream-sources/git/jdk-jdk/build/linux-x86_64-server-release/images/jdk:/opt/jdk:z \
    -v $(pwd)/test:/opt/test:z \
    fedora:36 /opt/jdk/bin/java -Xlog:perf+memops=debug -cp /opt/test HelloWait
[0.001s][debug][perf,memops] PerfDataMemorySize = 32768, os::vm_allocation_granularity = 4096, adjusted size = 32768
[0.001s][debug][perf,memops] flock for stale file check failed for /tmp/hsperfdata_sgehwolf/1
[0.001s][info ][perf,memops] Trying to open /tmp/hsperfdata_sgehwolf/1
[0.001s][warning][perf,memops] Cannot use file /tmp/hsperfdata_sgehwolf/1 because it is locked by another process (errno = 11)
[0.001s][debug  ][perf,memops] PerfMemory created: address = 0x00007fc60bc79000, size = 32768
Hello!
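
For anyone who wants to see the `errno = 11` case outside the JVM, here is a minimal Python sketch of a non-blocking `flock()` attempt. It is not the HotSpot code: `try_lock` and `demo` are hypothetical names, and it assumes Linux, where `flock()` locks belong to the open file description (so a second `open()` of the same file conflicts even within one process) and `EAGAIN` is 11.

```python
import errno
import fcntl
import os
import tempfile


def try_lock(path):
    """Open `path` and try to take an exclusive, non-blocking flock().

    Returns (fd, 0) on success, or (None, errno) if the lock is unavailable.
    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes flock() fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as e:
        os.close(fd)
        return None, e.errno
    return fd, 0


def demo():
    """Simulate two competing writers using two independent opens of one file."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "1")  # stands in for /tmp/hsperfdata_<user>/<pid>
    first_fd, first_err = try_lock(path)
    # Second open creates a separate open file description, so it competes
    # with the first one for the lock.
    second_fd, second_err = try_lock(path)
    # Clean up: closing the descriptor also releases its lock.
    for fd in (first_fd, second_fd):
        if fd is not None:
            os.close(fd)
    os.remove(path)
    os.rmdir(tmpdir)
    return first_err, second_err


if __name__ == "__main__":
    first_err, second_err = demo()
    print("first attempt errno:", first_err)
    print("second attempt errno:", second_err,
          errno.errorcode.get(second_err, "?"))
```

On Linux the second attempt should fail with `EAGAIN`/`EWOULDBLOCK`, which matches the errno reported in the warning above.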

-------------

Marked as reviewed by sgehwolf (Reviewer).

PR: https://git.openjdk.org/jdk/pull/9406
