Hi.
We tried to upgrade a number of Solr images from 8.11 to 9.0.0 this week and
ran into the same problems with the official Docker image as well
(approximately 30 pods running in Kubernetes, crashing on average once an hour).
We even tried building a Docker image on a different base image
(bellsoft/liberica-openjdk-debian:17), with the same result, although the
crashing frame differed somewhat (optimize instead of build_loop_late_post_work).
For now we have ended up building a Docker image of Solr 9.0.0 based on
openjdk:11-jre, and it has been running for several hours without crashes. Maybe
the official Docker image should be reverted to this until the bug is fixed?
For reference, here are two crashes (hs_err_pidXX.log and replay_pidXX.log can be
provided on request):
Temurin-17.0.4.1+1:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fd81c7e6153, pid=14, tid=85
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.4.1+1 (17.0.4.1+1)
(build 17.0.4.1+1)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.4.1+1 (17.0.4.1+1, mixed mode,
sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0xacc153] PhaseIdealLoop::build_loop_late_post_work(Node*,
bool)+0x153
#
# Core dump will be written. Default location: /core.%e.14.%t
#
# An error report file with more information is saved as:
# /opt/solr-9.0.0/server/hs_err_pid14.log
#
# Compiler replay data is saved as:
# /opt/solr-9.0.0/server/replay_pid14.log
#
# If you would like to submit a bug report, please visit:
# https://github.com/adoptium/adoptium-support/issues
#
/opt/scripts/startSolr.sh: line 54: 14 Aborted
/opt/solr/bin/solr -f -p ${solrport} -m ${memory}
-Dsolr.jetty.request.header.size=65535 -Dsolr.disable.allowUrls=true
-Dsolrindex=${profile} -Dlog4j2.formatMsgNoLookups=true -Dfinnapp=${finnapp}
-Dsolr.master.host=${masterhost} -Denable.slave=true -Denable.replication=true
-Dsolr.http1=true -Dsolr.disable.shardsWhitelist=true
-Dodin.job.host=solr-cl-job -Dodin.bap.host=solr-cl-bap
-Dodin.motor.host=solr-cl-motor -Dodin.estate.host=solr-cl-estate

bellsoft/liberica-openjdk-debian:17:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f96e2515e7f, pid=13, tid=76
#
# JRE version: OpenJDK Runtime Environment (17.0.4.1+1) (build 17.0.4.1+1-LTS)
# Java VM: OpenJDK 64-Bit Server VM (17.0.4.1+1-LTS, mixed mode, tiered,
compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x515e7f] PhaseIdealLoop::optimize(PhaseIterGVN&,
LoopOptsMode)+0x13bf
#
# Core dump will be written. Default location: /core.%e.13.%t
#
# An error report file with more information is saved as:
# /opt/solr-9.0.0/server/hs_err_pid13.log
#
# Compiler replay data is saved as:
# /opt/solr-9.0.0/server/replay_pid13.log
#
# If you would like to submit a bug report, please visit:
# https://bell-sw.com/support
#
/opt/scripts/startSolr.sh: line 54: 13 Aborted
/opt/solr/bin/solr -f -p ${solrport} -m ${memory}
-Dsolr.jetty.request.header.size=65535 -Dsolr.disable.allowUrls=true
-Dsolrindex=${profile} -Dlog4j2.formatMsgNoLookups=true -Dfinnapp=${finnapp}
-Dsolr.master.host=${masterhost} -Denable.slave=true -Denable.replication=true
-Dsolr.http1=true -Dsolr.disable.shardsWhitelist=true
-Dodin.job.host=solr-cl-job -Dodin.bap.host=solr-cl-bap
-Dodin.motor.host=solr-cl-motor -Dodin.estate.host=solr-cl-estate
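
In case it helps anyone compare crashes across pods, here is a minimal Python sketch that pulls the "Problematic frame" out of the crash header. It assumes the header looks like the two reports above. The function name problematic_frame is made up for illustration and is not part of any Solr or JDK tooling.

```python
import re

# Matches the line that follows "# Problematic frame:", for example:
#   "# V [libjvm.so+0xacc153] PhaseIdealLoop::build_loop_late_post_work(Node*,"
# Assumption: the frame line has the shape seen in the two reports above.
FRAME_RE = re.compile(
    r"^#\s+\S\s+\[(?P<lib>[^\]+]+)\+(?P<offset>0x[0-9a-fA-F]+)\]\s+(?P<symbol>[\w:~]+)"
)


def problematic_frame(log_text):
    """Return (library, offset, symbol) for the first problematic frame, or None."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "# Problematic frame:" and i + 1 < len(lines):
            match = FRAME_RE.match(lines[i + 1])
            if match:
                return match.group("lib"), match.group("offset"), match.group("symbol")
    return None


# Header excerpt copied from the Temurin crash above.
sample = """#
# Problematic frame:
# V [libjvm.so+0xacc153] PhaseIdealLoop::build_loop_late_post_work(Node*,
bool)+0x153
#
"""

print(problematic_frame(sample))
# prints ('libjvm.so', '0xacc153', 'PhaseIdealLoop::build_loop_late_post_work')
```

Running it over the console output or hs_err file from each pod should show whether the crashes keep landing in the same PhaseIdealLoop functions.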
Regards,
Fredrik
--
Fredrik Rødland               Cell:     +47 99 21 98 17
Maisen Pedersens vei 1        Twitter:  @fredrikr
NO-1363 Høvik, NORWAY         flickr:   http://www.flickr.com/fmmr/
http://rodland.no             about.me: http://about.me/fmr