> On 22 Oct 2023, at 23:31, Shawn Heisey <apa...@elyograg.org.INVALID> wrote:
> 
> On 10/21/2023 2:31 AM, Ing. Andrea Vettori wrote:
>> Hello, we’re using two SOLR servers (same hw, same version of solr and java, 
>> same solr config). The SOLR version is 9.3 and JVM is Adoptium JDK 17.0.8.1 
>> on Linux.
>> They both were running fine since a couple years (we upgraded from SOLR 8 to 
>> 9 with full reindexing some time ago).
> 
> DEB and RPM-based distros typically make it very easy to install OpenJDK out 
> of the box, there is no need to download something like Adoptium:

After some years on Oracle JDK we tried a few JVM distributions (not sure 
that is the correct term), settled on AdoptOpenJDK, and later moved to 
Adoptium. I have never had any issues, and as far as I know it’s the same 
code base as OpenJDK…

>> Yesterday one of the server died with JVM crash with the following reason (I 
>> have the full JVM trace if needed).
>> Once restarted the server ran fine and received data updates every 15 
>> minutes, and responded to queries during the day.
>> Today the server died around the same time with the same JVM trace.
>> The time it died two times is early in the morning when we upload a lot of 
>> data. Then during the day the updates are less heavy in terms of size.
>> One strange thing is that only one of the server died, the other one is 
>> running fine and it’s receiving the same data.
>> Another thing to note is that in solrconfig we still had the “old” caches of 
>> SOLR 8 configured. Two days ago we changed the configuration to use 
>> CaffeineCache on one of the four cores (the biggest one). Not sure if it’s 
>> related but the time is suspicious… but why would it crash only on one of 
>> the servers since they’re both identical in configuration, version and 
>> hardware? Anyway I replaced solrconfig with the old configuration to see 
>> what happens tomorrow.
> 
> Sig11 crashes that are confined to one system usually indicate bad hardware.  
> It could be a bad DIMM, a bad motherboard, or a bad CPU ... in that order, 
> with the DIMM being the most likely problem.

That could be the cause, but then why, after restoring the old configuration, 
is it now working well on both servers, as it did for the previous two years? 
Today is the second morning without a crash, so we have had two days of 
crashes with the new config and two days without problems with the old 
config. I guess we must leave it running for a few more days to see what 
happens :)

> 
> Solr 9.3 includes the workaround for the caffeine-related Java crash, and the 
> version of Java that you are running doesn't have that bug anyway.

I think the bug has been fixed only in Java >= 20?

Thanks
— 
Ing. Andrea Vettori
Sistemi Informativi
B2BIres s.r.l.
