Hi all,

On CI, some pulsar-metadata tests fail frequently with JVM crashes.

#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f10670d5113, pid=3902, tid=3941
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.8.1+1 (17.0.8.1+1) (build 17.0.8.1+1)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.8.1+1 (17.0.8.1+1, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xad5113] PhaseIdealLoop::build_loop_late_post_work(Node*, bool)+0xe3

It appears that this is most likely JVM bug JDK-8314024 [1], which is fixed in Java 17.0.10 (release date Jan 26, 2024) and in Java 21.0.1, which was released about 2 weeks ago.

One possible workaround would be to run Pulsar on Java 21.0.1. All tests pass in the master branch on Java 21, so it is probable that 3.0.x or 3.1.x is directly compatible with Java 21 at runtime. Note that the master branch contains support for developing and running tests with Java 21, and that required multiple library updates.

This is the Pulsar issue: https://github.com/apache/pulsar/issues/19307
Sample crash reports: https://gist.github.com/lhotari/53b72683ad4f339dfbcfd8b9b97062b9

There was an earlier bug with a similar stack trace in the crash report: JDK-8285835 [2], which was fixed in 17.0.7. Since the crashes continue on 17.0.8.1, that fix did not resolve the issue, and we now expect JDK-8314024 [1] to be the fix for the crashes we are facing.

Regards,

-Lari

1 - https://bugs.openjdk.org/browse/JDK-8314024
2 - https://bugs.openjdk.org/browse/JDK-8285835
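
P.S. For anyone who wants to check whether a given environment already runs a JVM with the JDK-8314024 fix, here is a minimal sketch. It only encodes the versions mentioned above (17.0.10 and 21.0.1). The class name JdkFixCheck and the method hasFix are made up for illustration, and treating Java 22 and later as fixed is an assumption on my part, not something confirmed above. Other release lines are reported as not known to be fixed.

```java
// Hypothetical helper, not part of Pulsar: reports whether a JVM version
// is at or above the releases named in this mail as containing the
// JDK-8314024 fix (17.0.10 and 21.0.1).
final class JdkFixCheck {
    private static final Runtime.Version FIXED_17 = Runtime.Version.parse("17.0.10");
    private static final Runtime.Version FIXED_21 = Runtime.Version.parse("21.0.1");

    static boolean hasFix(Runtime.Version v) {
        int feature = v.feature();
        if (feature == 17) {
            return v.compareToIgnoreOptional(FIXED_17) >= 0;
        }
        if (feature == 21) {
            return v.compareToIgnoreOptional(FIXED_21) >= 0;
        }
        // Assumption: releases after 21 include the fix.
        // Anything else (for example 11, or 18 to 20) is treated as
        // "not known to be fixed" because it is not covered in this mail.
        return feature > 21;
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        System.out.println("Running on " + current
                + ", JDK-8314024 fix expected: " + hasFix(current));
    }
}
```

Running it with the same java binary that CI uses shows whether that runtime is one of the versions expected to contain the fix.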