Re: RFR: 8286030: Avoid JVM crash when containers share the same /tmp dir [v2]

Thomas Stuefe Fri, 08 Jul 2022 08:28:54 -0700

On Thu, 7 Jul 2022 21:58:55 GMT, Ioi Lam <ik...@openjdk.org> wrote:

>> src/hotspot/os/posix/perfMemory_posix.cpp line 781:
>> 
>>> 779:     // signal the process, then the file is assumed to
>>> 780:     // be stale and is removed because the files for such a
>>> 781:     // process should be in a different user specific directory.
>> 
>> I am not sure this is good. If two conflicting hotspots share the same PID 
>> in tmp, from two different users, this will probably be a setup error. Is 
>> the best way really to provoke SIGBUS in the other VM? Seems a bit harsh. 
>> 
>> Also, the terminology would be wrong. It's not stale then, since the target 
>> process probably exists, is a VM, and uses that file.
>
> We will get a permission error from the `kill(pid, 0)` call only after we 
> have successfully grabbed the flock on the file. Note that if the file was 
> created by a live JVM process that has the flock fix (i.e., this PR), 
> regardless of which user owns the process, we will never come to here.
> 
> Note that the value of the `pid` variable is misleading. It is the NSPID of 
> another JVM that created the file. If the current JVM process runs in a 
> different PID namespace, it cannot reliably determine whether the file is 
> stale or not.
> 
> In general, I don't think we can trust `pid` at all when containers are 
> involved. But that's OK -- if you want to use Java in containers that share 
> the /tmp directory, you must upgrade to a JVM that has the flock fix. 
> Otherwise the behavior is undefined.
> 
> Otherwise, if you are:
> 
> - Not using containers, or
> - Using containers that don't share /tmp
> 
> then the logic for handling the `kill(pid, 0)` error is not changed by this 
> PR, so we are bug-for-bug compatible with older JVMs. If you think the 
> behavior should be changed, maybe that should be done in a separate PR?
> 
> Or, if you run into problems like "my hsperf files are randomly deleted", a 
> simple fix is to upgrade the JVM to one that has the flock fix :-)


Okay, you convinced me.
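
For readers following along, the ordering described above (take the flock 
first, and only then look at `kill(pid, 0)`) can be sketched roughly as below. 
This is only an illustration and not the code in perfMemory_posix.cpp: the 
helper name `classify_hsperf_file`, the `hsperf_state` enum and the 
three-way result are all made up for this example. It assumes the legacy rule 
quoted in the comment at line 779 (a permission error means "assume stale").

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type, for illustration only. */
enum hsperf_state { HSPERF_IN_USE, HSPERF_STALE, HSPERF_UNKNOWN };

/* Hypothetical helper: decide whether the hsperf file at `path`, whose
 * name encodes `pid`, still belongs to a live JVM. */
static enum hsperf_state classify_hsperf_file(const char *path, pid_t pid) {
    if (pid <= 0) {
        return HSPERF_UNKNOWN;   /* kill() would target a process group */
    }
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return HSPERF_UNKNOWN;
    }

    enum hsperf_state result;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        /* Someone else holds the lock: a live JVM with the flock fix
         * owns the file, whatever PID namespace or user it runs in. */
        result = (errno == EWOULDBLOCK) ? HSPERF_IN_USE : HSPERF_UNKNOWN;
    } else if (kill(pid, 0) == 0) {
        /* Nobody holds the lock, but the pid is alive and signalable;
         * it may be an older JVM without the flock fix. */
        result = HSPERF_IN_USE;
    } else if (errno == ESRCH || errno == EPERM) {
        /* No such process, or (legacy rule) a process we may not
         * signal: the file is assumed to be stale. */
        result = HSPERF_STALE;
    } else {
        result = HSPERF_UNKNOWN;
    }

    close(fd);                   /* closing also drops any lock we took */
    return result;
}
```

The point of the ordering is that the `kill(pid, 0)` branches are only 
reached once the lock has been obtained, so a file held by a JVM with the 
flock fix is never classified by its (possibly misleading) pid.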

-------------

PR: https://git.openjdk.org/jdk/pull/9406
