On Mon, 7 Oct 2024 19:37:51 GMT, Sebastian Lövdahl <d...@openjdk.org> wrote:
>> The fix is twofold. >> >> 1. Stop the main container after an iteration is done. The main container is >> started with its runtime defined as 120 seconds, which means that each >> iteration takes 120 seconds. In reality, one iteration takes a few seconds >> while 115 seconds is spent waiting on the main container exiting. >> >> 2. Change the name of the main container to be unique per iteration. >> Containers are started with `--rm`, which means they are removed after >> exiting. However, the removal is done asynchronously _after_ the `stop` >> command has returned. This means that the second iteration may get an error >> if the same container name is used if the removal was not done before the >> container is started in the next iteration. >> >> On my machine, this cuts down the test runtime using Podman from 4m 13s to >> 17s. Using Docker, the runtime goes from 4m 15s to 41s. >> >> Podman only runs half the test cases (since JDK-8341310) which explain some >> of the difference. But there is also something strange going on in the >> Docker case; every `docker stop` call takes 10 seconds, and I have not been >> able to figure out what exactly causes it. >> >> Doing a manual `kill [container Java process PID]` gracefully terminates the >> Java process and container, but `docker stop` never does. Instead, it blocks >> for 10 seconds before abruptly killing the process using `SIGKILL`. I >> confirmed this with a simplified case and both >> `strace -e 'trace=!all' docker run --init eclipse-temurin:23 java ..` and >> `strace -e 'trace=!all' docker run eclipse-temurin:23 java ..`, no signals >> were ever visible when calling either `docker stop` or `docker kill`. >> >> https://www.docker.com/blog/docker-best-practices-choosing-between-run-cmd-and-entrypoint/ >> and "What is PID 1 and why does it matter?" talks about why >> [`--init`](https://docs.docker.com/reference/cli/docker/container/run/#init) >> is supposed to help. > > Sebastian Lövdahl has updated the pull request incrementally with one > additional commit since the last revision: > > Have EventGeneratorLoop end after a more predictable duration Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/21331#pullrequestreview-2455814864