This has happened yet again.

Does anyone yet have any input on the idea of using the Leader's collection 
name in Leader/Follower replication (or pre-Solr8.7 Master/Slave replication), 
rather than the core name?

-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C] <craig.oak...@nih.gov.INVALID> 
Sent: Thursday, June 3, 2021 10:30 AM
To: users@solr.apache.org
Subject: RE: Cores renamed

As a potential solution, I was wondering about implementing Master/Slave 
replication using the collection name of the Master rather than the core name. 
My initial experiment with this in a test environment seemed to work. Does 
anyone have any input on the idea of using the Master's collection name in 
Master/Slave replication, rather than the core name?
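
For concreteness, here is a minimal sketch of the two addressing styles being compared. The host, port, and the helper name `replication_url` are hypothetical, made up for illustration; only the collection and core names come from this thread. Whether Solr reliably routes the collection-based URL to the current leader core is exactly the open question, so treat this as an illustration of the idea rather than a confirmed configuration.

```python
def replication_url(base_url: str, name: str) -> str:
    """Build a replication handler URL for a given core or collection name."""
    return f"{base_url.rstrip('/')}/{name}/replication"


# Hypothetical leader address; the real hosts and ports are redacted in this thread.
base = "http://leader-host:8983/solr"

# Core-based URL: breaks whenever the core is dropped and recreated under a new name.
by_core = replication_url(base, "ipg_report_large_shard1_replica_n7")

# Collection-based URL: the name stays the same across core renames.
by_collection = replication_url(base, "ipg_report_large")

print(by_core)
print(by_collection)
```

The second form is the one that would go into the follower's replication configuration in place of the core-based one.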

-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C] <craig.oak...@nih.gov.INVALID> 
Sent: Wednesday, June 02, 2021 5:46 PM
To: users@solr.apache.org
Subject: RE: Cores renamed

It happened again this morning.

Attached is an excerpt from solr.log (with port numbers and IP addresses 
redacted), and below is the current CLUSTERSTATUS (with port numbers redacted).

Is there yet any explanation?

{
  "responseHeader":{
    "status":0,
    "QTime":10},
  "cluster":{
    "collections":{
      "ipg_report_large":{
        "pullReplicas":"0",
        "replicationFactor":"1",
        "shards":{"shard1":{
            "range":"80000000-7fffffff",
            "state":"active",
            "replicas":{
              "core_node8":{
                "core":"ipg_report_large_shard1_replica_n7",
                "base_url":"http://solrdbprod26.be-md:####/solr",
                "node_name":"solrdbprod26.be-md:####_solr",
                "state":"active",
                "type":"NRT",
                "force_set_state":"false",
                "leader":"true"},
              "core_node10":{
                "core":"ipg_report_large_shard1_replica_n9",
                "base_url":"http://solrdbprod25.be-md:####/solr",
                "node_name":"solrdbprod25.be-md:####_solr",
                "state":"active",
                "type":"NRT",
                "force_set_state":"false"}}}},
        "router":{"name":"compositeId"},
        "maxShardsPerNode":"1",
        "autoAddReplicas":"false",
        "nrtReplicas":"1",
        "tlogReplicas":"0",
        "znodeVersion":741,
        "configName":"ipg_report_large"}},
    "live_nodes":["solrdbprod26.be-md:####_solr",
      "solrdbprod25.be-md:####_solr"]}}

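Since the core names keep changing, one way to find the current leader core is to read it out of CLUSTERSTATUS rather than hard-coding it. The following is a minimal sketch, assuming the response has the shape shown above; the function name `leader_cores` is made up for illustration, and the sample input is trimmed from the output above (ports remain redacted as `####`).

```python
import json


def leader_cores(cluster_status: dict) -> dict:
    """Map (collection, shard) to the core name of the replica flagged as leader."""
    leaders = {}
    for coll_name, coll in cluster_status["cluster"]["collections"].items():
        for shard_name, shard in coll["shards"].items():
            for replica in shard["replicas"].values():
                # In the output above, "leader" is the string "true" and is
                # simply absent on non-leader replicas.
                if replica.get("leader") == "true":
                    leaders[(coll_name, shard_name)] = replica["core"]
    return leaders


# Trimmed copy of the CLUSTERSTATUS response above.
sample = json.loads("""
{"cluster": {"collections": {"ipg_report_large": {"shards": {"shard1": {"replicas": {
  "core_node8":  {"core": "ipg_report_large_shard1_replica_n7",
                  "node_name": "solrdbprod26.be-md:####_solr", "leader": "true"},
  "core_node10": {"core": "ipg_report_large_shard1_replica_n9",
                  "node_name": "solrdbprod25.be-md:####_solr"}
}}}}}}}
""")

print(leader_cores(sample))
# {('ipg_report_large', 'shard1'): 'ipg_report_large_shard1_replica_n7'}
```

Running this against the CLUSTERSTATUS from each incident would show which core held the leader role after each rename.
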
-----Original Message-----
From: Oakley, Craig (NIH/NLM/NCBI) [C] <craig.oak...@nih.gov.INVALID> 
Sent: Monday, May 17, 2021 5:01 PM
To: users@solr.apache.org
Subject: RE: Cores renamed

The entire directory for the old core gets removed.

Here is CLUSTERSTATUS (again with port numbers redacted). I ran CLUSTERSTATUS 
on both nodes, and the only difference was QTime (that is, there was no real 
difference):

{
  "responseHeader":{
    "status":0,
    "QTime":5},
  "cluster":{
    "collections":{
      "ipg_report_large":{
        "pullReplicas":"0",
        "replicationFactor":"1",
        "shards":{"shard1":{
            "range":"80000000-7fffffff",
            "state":"active",
            "replicas":{
              "core_node4":{
                "core":"ipg_report_large_shard1_replica_n3",
                "base_url":"http://solrdbprod26.be-md:####/solr",
                "node_name":"solrdbprod26.be-md:####_solr",
                "state":"active",
                "type":"NRT",
                "force_set_state":"false"},
              "core_node6":{
                "core":"ipg_report_large_shard1_replica_n5",
                "base_url":"http://solrdbprod25.be-md:####/solr",
                "node_name":"solrdbprod25.be-md:####_solr",
                "state":"active",
                "type":"NRT",
                "force_set_state":"false",
                "leader":"true"}}}},
        "router":{"name":"compositeId"},
        "maxShardsPerNode":"1",
        "autoAddReplicas":"false",
        "nrtReplicas":"1",
        "tlogReplicas":"0",
        "znodeVersion":710,
        "configName":"ipg_report_large"}},
    "live_nodes":["solrdbprod26.be-md:####_solr",
      "solrdbprod25.be-md:####_solr"]}}

-----Original Message-----
From: matthew sporleder <msporle...@gmail.com> 
Sent: Monday, May 17, 2021 4:34 PM
To: users@solr.apache.org
Subject: Re: Cores renamed

Can you verify all of your zkHost connection params across the entire
cluster, and share the replicationFactor, autoAddReplicas, etc for the
collection?

My theory is that you have two ZooKeeper configs conflicting as master
elections happen, causing new replicas to get created on-the-fly.

Also -- do these cores get deleted from the filesystem or left around?

On Mon, May 17, 2021 at 4:11 PM Oakley, Craig (NIH/NLM/NCBI) [C]
<craig.oak...@nih.gov.invalid> wrote:
>
> > What does the core rename itself to? That would probably be the biggest 
> > hint.
>
> At 4:01pm 1/14/21, Solr decided on its own to drop the core 
> ipg_report_large_shard1_replica_n1 and to create the core 
> ipg_report_large_shard1_replica_n7 in its place
>
> At 4:33am 1/16/21, Solr decided on its own to drop the core 
> ipg_report_large_shard1_replica_n5 (on another node of the same SolrCloud) 
> and to create the core ipg_report_large_shard1_replica_n9 in its place
>
> At about 4:10pm 1/26/21, Solr decided on its own to drop this core 
> ipg_report_large_shard1_replica_n9 and to create the core 
> ipg_report_large_shard1_replica_n13 in its place
>
> In March, we created a new SolrCloud for the same collection, and reloaded 
> the data
>
> At 7:59am 5/12/21, Solr decided on its own to drop the core 
> ipg_report_large_shard1_replica_n1 and to create the core 
> ipg_report_large_shard1_replica_n5 in its place
>
> I am attaching an excerpt from solr.log for the most recent problem (with IP 
> addresses and port numbers redacted)
>
> Please note that Master/Slave replication breaks when a core is renamed, so 
> this can be a major problem.
>
>
> Any ideas?
>
> -----Original Message-----
> From: Alexandre Rafalovitch <arafa...@gmail.com>
> Sent: Wednesday, May 12, 2021 2:10 PM
> To: users@solr.apache.org
> Subject: Re: Cores renamed
>
> This is truly a shot in the dark, but is it possible you have
> something in core.properties file (which is where the core name is for
> non-Cloud setup)?
>
> What does the core rename itself to? That would probably be the biggest hint.
>
> Regards,
>    Alex.
>
> On Wed, 12 May 2021 at 14:00, Oakley, Craig (NIH/NLM/NCBI) [C]
> <craig.oak...@nih.gov.invalid> wrote:
> >
> > This phenomenon has happened again (this time without any REQUESTRECOVERY)
> >
> > Does anyone yet have any explanation of this?
> >
> > -----Original Message-----
> > From: Oakley, Craig (NIH/NLM/NCBI) [C] <craig.oak...@nih.gov.INVALID>
> > Sent: Thursday, January 28, 2021 10:57 AM
> > To: solr-u...@lucene.apache.org
> > Subject: Cores renamed
> >
> > We recently have had a few occasions when cores for one specific collection 
> > were renamed (or more likely dropped and recreated, and thus ended up with 
> > a different core name).
> >
> > Is this a known phenomenon? Is there any explanation?
> >
> > It may be relevant that we just recently started running this SolrCloud on 
> > version 8.5.2, although the collection was created under Solr7.4. Also, 
> > this collection seems to experience some heavy updates such that the 
> > non-Leader replica has trouble keeping up. One of these renames occurred at 
> > 4:33am, so I highly suspect that the rename (or drop and recreate) was done 
> > by some internal Solr thread rather than by any of my coworkers. One other 
> > potential clue is that I can see that 
> > /solr/admin/cores?action=REQUESTRECOVERY was usually run on the new core a 
> > moment after it was created.
> >
> > Does anyone have any insights?
