Hello, we’re running two Solr servers (same hardware, same versions of Solr and Java, same Solr configuration). The Solr version is 9.3 and the JVM is Adoptium JDK 17.0.8.1 on Linux. Both have been running fine for a couple of years (we upgraded from Solr 8 to 9, with a full reindex, some time ago).
Yesterday one of the servers died with a JVM crash, with the following reason (I have the full JVM trace if needed). Once restarted, the server ran fine, received data updates every 15 minutes, and responded to queries during the day. Today the server died around the same time with the same JVM trace. Both crashes happened early in the morning, when we upload a lot of data; during the rest of the day the updates are smaller.

One strange thing is that only one of the servers died; the other one is running fine, and it receives the same data.

Another thing to note is that our solrconfig still had the “old” Solr 8 caches configured. Two days ago we changed the configuration to use CaffeineCache on one of the four cores (the biggest one). I'm not sure it’s related, but the timing is suspicious. Then again, why would it crash on only one of the servers, since they’re identical in configuration, version and hardware? In any case, I have restored the old solrconfig to see what happens tomorrow.

I searched the bug fixes in Solr 9.4 and Lucene 9.8 and did not find anything that seems related to this. Any hints?
Thanks

Current thread (0x00007f3f28002d10):  ConcurrentGCThread "G1 Conc#4" [stack: 0x00007f3e8df6e000,0x00007f3e8e06e000] [id=25851]

Stack: [0x00007f3e8df6e000,0x00007f3e8e06e000],  sp=0x00007f3e8e06cc20,  free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x74efc8]  void OopOopIterateDispatch<G1RebuildRemSetClosure>::Table::oop_oop_iterate<InstanceKlass, narrowOop>(G1RebuildRemSetClosure*, oopDesc*, Klass*)+0x98
V  [libjvm.so+0x750c09]  G1RebuildRemSetTask::G1RebuildRemSetHeapRegionClosure::do_heap_region(HeapRegion*)+0x579
V  [libjvm.so+0x7dc2d6]  HeapRegionManager::par_iterate(HeapRegionClosure*, HeapRegionClaimer*, unsigned int) const+0x96
V  [libjvm.so+0x74d799]  G1RebuildRemSetTask::work(unsigned int)+0x69
V  [libjvm.so+0xf17f2f]  GangWorker::loop()+0x5f
V  [libjvm.so+0xf17f8f]
V  [libjvm.so+0xe68d40]  Thread::call_run()+0xc0
V  [libjvm.so+0xc1e681]  thread_native_entry(Thread*)+0xe1

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000003

Register to memory mapping:

RAX=0x00000006a0b79a98 is an oop: java.lang.String
{0x00000006a0b79a98} - klass: 'java/lang/String'
 - string: "cmesmedia_44_2021_12"
RBX=0x00000006e0382a18 is an oop: org.apache.lucene.document.StoredField
{0x00000006e0382a18} - klass: 'org/apache/lucene/document/StoredField'
 - ---- fields (total size 4 words):
 - protected final 'type' 'Lorg/apache/lucene/index/IndexableFieldType;' @12  a 'org/apache/lucene/document/FieldType'{0x0000000691241d58} (d22483ab)
 - protected final 'name' 'Ljava/lang/String;' @16  "cmesmedia_44_2021_12"{0x00000006a0b79a98} (d416f353)
 - protected 'fieldsData' 'Ljava/lang/Object;' @20  a 'java/lang/Float'{0x00000006e0382a38} = 55.000000 (dc070547)
 - protected 'tokenStream' 'Lorg/apache/lucene/analysis/TokenStream;' @24  NULL (0)
RCX=0x0000000000000003 is an unknown value
RDX=0x00000000408fb0b0 is an unknown value
RSP=0x00007f3e8e06cc20 points into unknown readable memory: 0x0000000000301c00 | 00 1c 30 00 00 00 00 00
RBP=0x00007f3e8e06cc60 points into unknown readable memory: 0x00007f3e8e06cd50 | 50 cd 06 8e 3e 7f 00 00
RSI=0x0000000000200000 is an unknown value
RDI=0x00007f3d881d70d0 points into unknown readable memory: 0x00000000003803ff | ff 03 38 00 00 00 00 00
R8 =0x000000000000006d is an unknown value
R9 =0x0000000000000044 is an unknown value
R10=0x0000000000301c00 is an unknown value
R11=0x0000000000000180 is an unknown value
R12=0x00000006e0382a28 is pointing into object: org.apache.lucene.document.StoredField
{0x00000006e0382a18} - klass: 'org/apache/lucene/document/StoredField'
 - ---- fields (total size 4 words):
 - protected final 'type' 'Lorg/apache/lucene/index/IndexableFieldType;' @12  a 'org/apache/lucene/document/FieldType'{0x0000000691241d58} (d22483ab)
 - protected final 'name' 'Ljava/lang/String;' @16  "cmesmedia_44_2021_12"{0x00000006a0b79a98} (d416f353)
 - protected 'fieldsData' 'Ljava/lang/Object;' @20  a 'java/lang/Float'{0x00000006e0382a38} = 55.000000 (dc070547)
 - protected 'tokenStream' 'Lorg/apache/lucene/analysis/TokenStream;' @24  NULL (0)
R13=0x00000006e0382a34 is pointing into object: org.apache.lucene.document.StoredField
{0x00000006e0382a18} - klass: 'org/apache/lucene/document/StoredField'
 - ---- fields (total size 4 words):
 - protected final 'type' 'Lorg/apache/lucene/index/IndexableFieldType;' @12  a 'org/apache/lucene/document/FieldType'{0x0000000691241d58} (d22483ab)
 - protected final 'name' 'Ljava/lang/String;' @16  "cmesmedia_44_2021_12"{0x00000006a0b79a98} (d416f353)
 - protected 'fieldsData' 'Ljava/lang/Object;' @20  a 'java/lang/Float'{0x00000006e0382a38} = 55.000000 (dc070547)
 - protected 'tokenStream' 'Lorg/apache/lucene/analysis/TokenStream;' @24  NULL (0)
R14=0x00007f3e30786358 is pointing into metadata
R15=0x00007f3e8e06cdc8 points into unknown readable memory: 0x00007f3f64941f68 | 68 1f 94 64 3f 7f 00 00

--
Ing. Andrea Vettori
Sistemi Informativi
B2BIres s.r.l.