On Mon, 22 May 2023 10:22:16 GMT, Volker Simonis <simo...@openjdk.org> wrote:

>> Since JDK13, executing commands in a sub-process defaults to the so called 
>> `POSIX_SPAWN` launching mechanism (i.e. 
>> `-Djdk.lang.Process.launchMechanism=POSIX_SPAWN`) on Linux. This works by 
>> using `posix_spawn(3)` to firstly start a tiny helper program called 
>> `jspawnhelper` in a subprocess. In a second step, `jspawnhelper` reads the 
>> command data from the parent Java process over a Unix pipe and finally 
>> executes (i.e. `execvp(3)`) the requested command.
>> 
>> In cases where the parent process terminates abnormally before 
>> `jspawnhelper` has read all the expected data from the pipe, `jspawnhelper` 
>> will block indefinitely on the reading end of the pipe. This is especially 
>> harmful if the parent process had open sockets, because in that case, the 
>> forked `jspawnhelper` process will inherit them and keep all the 
>> corresponding ports open effectively preventing other processes to bind to 
>> them. Notice that this is not an abstract scenario. We've observed this 
>> regularly in production with services which couldn't be restarted after a 
>> crash after migrating to JDK 17.
>> 
>> The fix of the issue is rather trivial. `jspawnhelper` has to close its 
>> writing end of the pipe which connects it with the parent Java process 
>> *before* starting to read from that pipe such that reads from the pipe can 
>> immediately return with EOF if the parent process terminates abnormally.
>> 
>> Also did some cleanup:
>>  - in `jspawnhelper.c` correctly chek the result of `readFully()`
>>  - in `ProcessImpl_md.c` use `restartableWrite()` instead of `write()` to 
>> write the command data from the parent process to `jspawnhelper`
>>  - in `childproc.{c,h}` remove remaining traces of `MODE_CLONE` because 
>> that's long gone.
>
> Volker Simonis has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains five commits:
> 
>  - Merge branch 'master' into JDK-8307990
>  - Address comments from tstuefe and RogerRiggs
>  - REALLY adding the test :)
>  - Added test case
>  - 8307990: jspawnhelper must close its writing side of a pipe before reading 
> from it

Looks good. 

Please enable for muslc (see remark). Otherwise, all other remarks are nits and 
I leave it up to you whether you accept them. I don't need another review.

src/java.base/unix/native/jspawnhelper/jspawnhelper.c line 148:

> 146:     r = sscanf (argv[argc-1], "%d:%d:%d", &fdinr, &fdinw, &fdout);
> 147:     if (r == 3 && fcntl(fdinr, F_GETFD) != -1 && fcntl(fdinw, F_GETFD) 
> != -1) {
> 148:         fstat(fdinr, &buf);

Preexisting, and possibly for another RFE:
- why don't we test fdout as well?
- we probably could make sure the output fds are valid for us (S_ISREG | 
S_ISFIFO | (possibly?) S_ISSOCK)

test/jdk/java/lang/ProcessBuilder/JspawnhelperProtocol.java line 28:

> 26:  * @test
> 27:  * @bug 8307990
> 28:  * @requires (os.family == "linux" & !vm.musl)

Muslc supports posix_spawn.

I tested your patch on Alpine, it works and the test works too. Please enable 
for musl.

test/jdk/java/lang/ProcessBuilder/JspawnhelperProtocol.java line 92:

> 90:         }
> 91:     }
> 92: 

Small nits, mainly to make this test easier to understand to the casual reader:
- consistent naming: we have "simulateCrashInChild" and 
"simulateCrashInParent", good, but then we have "childCode", which is actually 
the code executed in the parent, right?
- simulateCrashInChild and simulateCrashInParent could be merged into one 
function that does:
  - spawn parent with env var
  - read output, scan for `"posix_spawn:XXX"`
  - parent must have exit code != 0 and should not have timed out
  - if XXX is not 0, check grandchild pid (must not be a live process)

-------------

Marked as reviewed by stuefe (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13956#pullrequestreview-1438535260
PR Review Comment: https://git.openjdk.org/jdk/pull/13956#discussion_r1201539615
PR Review Comment: https://git.openjdk.org/jdk/pull/13956#discussion_r1201548057
PR Review Comment: https://git.openjdk.org/jdk/pull/13956#discussion_r1201770607

Reply via email to