On 10/21/23 03:31, Ing. Andrea Vettori wrote:

They had both been running fine for a couple of years (we upgraded from Solr 8 to 9 
with a full reindex some time ago).

Yesterday one of the servers died with a JVM crash for the following reason (I 
have the full JVM trace if needed).
Once restarted the server ran fine and received data updates every 15 minutes, 
and responded to queries during the day.
Today the server died around the same time with the same JVM trace.
...
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000003

A SIGSEGV (signal 11) in a previously stable program usually points to bad RAM.

https://tldp.org/FAQ/sig11/html/index.html

Make a memtest86 thumbdrive, boot the offending server off it, and let it run for a few days.

Check the fans: they don't last forever, and the DIMMs (or possibly the CPU) may be overheating from inadequate cooling.
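
If you can't get eyes on the fans right away, the temperatures the kernel reports are a rough stand-in. A minimal sketch, assuming a Linux box that exposes the usual /sys/class/thermal/thermal_zone*/{type,temp} files (temp is in millidegrees Celsius). The function names and the 85 C limit are mine, picked for illustration, not a vendor spec:

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"thermal_zoneN:type": degrees_C} for each zone found under base."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone has no readable type/temp file; skip it
        readings[f"{zone.name}:{kind}"] = millideg / 1000.0
    return readings


def hot_zones(readings, limit_c=85.0):
    """Keep only zones at or above limit_c (85 C is an arbitrary example limit)."""
    return {name: temp for name, temp in readings.items() if temp >= limit_c}


if __name__ == "__main__":
    for name, temp in read_thermal_zones().items():
        print(f"{name}: {temp:.1f} C")
```

Run it from cron around the time of day the crashes happen and see whether anything creeps up. It does not read fan speed, so it complements looking at the fans rather than replacing it.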

If the RAM checks out, install stress (or stress-ng) and push the CPU, or RAM and CPU together, for a few days to see if that crashes it.
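
For the combined CPU+RAM run, something along these lines should do. This is only a sketch: the helper names, worker count, memory percentage and duration are my own placeholders, so tune them to the box. The stress-ng options used are --cpu, --vm, --vm-bytes, --verify, --metrics-brief and --timeout:

```python
import shutil
import subprocess


def build_stress_cmd(duration="24h", vm_workers=2, vm_bytes="75%"):
    """Build a stress-ng command line that loads CPU and RAM together."""
    return [
        "stress-ng",
        "--cpu", "0",               # 0 = one CPU worker per online processor
        "--vm", str(vm_workers),    # number of memory stressor workers
        "--vm-bytes", vm_bytes,     # memory size handed to the vm stressor
        "--verify",                 # have stressors check their own results
        "--metrics-brief",          # short summary when the run finishes
        "--timeout", duration,      # stop after this long (s, m, h, d suffixes)
    ]


def run_stress(cmd):
    """Run the command and return its exit status; fail early if it is missing."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not installed or not on PATH")
    return subprocess.run(cmd).returncode
```

Calling run_stress(build_stress_cmd()) starts the run. A non-zero exit status, or the machine falling over partway through, is the result you are looking for. Stop Solr first so the two are not fighting over memory.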

Force an fsck of the drive where the index lives, just to cover all bases.

Dima
