I managed to get it to start replicating the missing nodes, manually, using:

curl "http://192.168.1.4:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard10&node=192.168.1.11:8983_solr"
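If more than one shard needs a replica on the rebuilt node, the same call
could be repeated per shard. The following is only a rough sketch under
assumptions: the shard list is a placeholder (only shard10 appears above), the
variable and function names are made up for illustration, and it prints the
curl commands as a dry run rather than running them.

```shell
#!/bin/sh
# Values taken from the command above; adjust them to the actual cluster.
SOLR_HOST="192.168.1.4:8983"          # a live node that serves the Collections API
COLLECTION="mycollection"
TARGET_NODE="192.168.1.11:8983_solr"  # node name of the rebuilt node

# Build the ADDREPLICA URL for one shard (hypothetical helper).
addreplica_url() {
  printf 'http://%s/solr/admin/collections?action=ADDREPLICA&collection=%s&shard=%s&node=%s\n' \
    "$SOLR_HOST" "$COLLECTION" "$1" "$TARGET_NODE"
}

# Dry run: print one curl command per shard instead of executing it.
# The shard names are placeholders; list the shards that lost a replica.
for shard in shard10 shard11; do
  echo "curl \"$(addreplica_url "$shard")\""
done
```

Once the printed commands look right, they can be pasted into a shell one at
a time.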

Is it normal to have to tell it manually which replicas to host after such a 
crash?

Thanks

-----Original Message-----
From: Scott <qm...@top-consulting.net> 
Sent: Friday, December 10, 2021 2:38 PM
To: users@solr.apache.org
Subject: Solr Cloud Node re-join issue

Having a bit of a weird issue.

 

We run a 4-node SolrCloud cluster, version 8.6.2, and for the most part it has 
been going quite well for more than 2 years now. We have to restart the nodes 
occasionally to free up RAM, but I guess that's normal.

 

Last night one of the nodes went into swap, used up all its memory and crashed.
Somehow, the way it crashed also removed all local cores/data. The cluster 
kept on chugging along, which was fine, but now I can't get the crashed node to 
resync with the others.


I restart it as usual, and there's no error that I can see in the logs at DEBUG 
level. In the Web GUI the node is there but with 0 bytes, and it never starts 
replicating.

 

I tried recreating the Solr home directory, but that didn't change anything.

 

The docs say to just start the node in cloud mode, which I've been doing:
https://solr.apache.org/guide/7_0/getting-started-with-solrcloud.html

 

Any tips?

Thanks!



This is a private message


