On Mon, 22 May 2023 10:22:16 GMT, Volker Simonis <simo...@openjdk.org> wrote:
>> Since JDK13, executing commands in a sub-process defaults to the so called >> `POSIX_SPAWN` launching mechanism (i.e. >> `-Djdk.lang.Process.launchMechanism=POSIX_SPAWN`) on Linux. This works by >> using `posix_spawn(3)` to firstly start a tiny helper program called >> `jspawnhelper` in a subprocess. In a second step, `jspawnhelper` reads the >> command data from the parent Java process over a Unix pipe and finally >> executes (i.e. `execvp(3)`) the requested command. >> >> In cases where the parent process terminates abnormally before >> `jspawnhelper` has read all the expected data from the pipe, `jspawnhelper` >> will block indefinitely on the reading end of the pipe. This is especially >> harmful if the parent process had open sockets, because in that case, the >> forked `jspawnhelper` process will inherit them and keep all the >> corresponding ports open effectively preventing other processes to bind to >> them. Notice that this is not an abstract scenario. We've observed this >> regularly in production with services which couldn't be restarted after a >> crash after migrating to JDK 17. >> >> The fix of the issue is rather trivial. `jspawnhelper` has to close its >> writing end of the pipe which connects it with the parent Java process >> *before* starting to read from that pipe such that reads from the pipe can >> immediately return with EOF if the parent process terminates abnormally. >> >> Also did some cleanup: >> - in `jspawnhelper.c` correctly chek the result of `readFully()` >> - in `ProcessImpl_md.c` use `restartableWrite()` instead of `write()` to >> write the command data from the parent process to `jspawnhelper` >> - in `childproc.{c,h}` remove remaining traces of `MODE_CLONE` because >> that's long gone. > > Volker Simonis has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains five commits: > > - Merge branch 'master' into JDK-8307990 > - Address comments from tstuefe and RogerRiggs > - REALLY adding the test :) > - Added test case > - 8307990: jspawnhelper must close its writing side of a pipe before reading > from it Looks good. Please enable for muslc (see remark). Otherwise, all other remarks are nits and I leave it up to you whether you accept them. I don't need another review. src/java.base/unix/native/jspawnhelper/jspawnhelper.c line 148: > 146: r = sscanf (argv[argc-1], "%d:%d:%d", &fdinr, &fdinw, &fdout); > 147: if (r == 3 && fcntl(fdinr, F_GETFD) != -1 && fcntl(fdinw, F_GETFD) > != -1) { > 148: fstat(fdinr, &buf); Preexisting, and possibly for another RFE: - why don't we test fdout as well? - we probably could make sure the output fds are valid for us (S_ISREG | S_ISFIFO | (possibly?) S_ISSOCK) test/jdk/java/lang/ProcessBuilder/JspawnhelperProtocol.java line 28: > 26: * @test > 27: * @bug 8307990 > 28: * @requires (os.family == "linux" & !vm.musl) Muslc supports posix_spawn. I tested your patch on Alpine, it works and the test works too. Please enable for musl. test/jdk/java/lang/ProcessBuilder/JspawnhelperProtocol.java line 92: > 90: } > 91: } > 92: Small nits, mainly to make this test easier to understand to the casual reader: - consistent naming: we have "simulateCrashInChild" and "simulateCrashInParent", good, but then we have "childCode", which is actually the code executed in the parent, right? - simulateCrashInChild and simulateCrashInParent could be merged into one function that does: - spawn parent with env var - read output, scan for `"posix_spawn:XXX"` - parent must have exit code != 0 and should not have timed out - if XXX is not 0, check grandchild pid (must not be a live process) ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13956#pullrequestreview-1438535260 PR Review Comment: https://git.openjdk.org/jdk/pull/13956#discussion_r1201539615 PR Review Comment: https://git.openjdk.org/jdk/pull/13956#discussion_r1201548057 PR Review Comment: https://git.openjdk.org/jdk/pull/13956#discussion_r1201770607