Peter Kroiss created SOLR-17306:
-----------------------------------

             Summary: Solr Repeater or Slave loses data after restart when 
replication is not enabled on leader
                 Key: SOLR-17306
                 URL: https://issues.apache.org/jira/browse/SOLR-17306
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 9.5.0, 9.6, 9.4, 9.3, 9.2
            Reporter: Peter Kroiss


We are testing Solr 9.6.2 in a leader - repeater - follower configuration. 
During periods of heavy writes to the leader, we disable replication on it to 
save bandwidth.

If the repeater restarts for some reason while replication is disabled on the 
leader, the repeater loses all documents and does not recover when the leader 
is opened for replication again.

The documents are deleted, but the indexVersion and generation properties are 
set to the leader's values, so the repeater or follower does not recover when 
the leader is opened for replication again.

It recovers only when there are commits on the leader after replication has 
been re-enabled.
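
The missing recovery can be illustrated with a small model. This is only a sketch, assuming the repeater skips fetching whenever its indexVersion and generation equal the leader's; that rule is inferred from the behaviour described above, not taken from the Solr source. The class and method names are hypothetical.

```java
// Simplified model of the behaviour described in this report; NOT the actual Solr source.
final class InSyncCheckSketch {

  /** Hypothetical rule: fetch only if the leader's version or generation differs. */
  static boolean wouldFetch(long leaderVersion, long leaderGeneration,
                            long localVersion, long localGeneration) {
    return leaderVersion != localVersion || leaderGeneration != localGeneration;
  }

  public static void main(String[] args) {
    // Values as reported by the details request in this issue.
    long version = 1716358708159L;
    long generation = 2913L;

    // Repeater is empty but carries the leader's values: no fetch is triggered.
    System.out.println("same values: " + wouldFetch(version, generation, version, generation));

    // Hypothetical values after one more commit on the leader: a fetch is triggered.
    System.out.println("after commit: "
        + wouldFetch(version + 1, generation + 1, version, generation));
  }
}
```

Under this assumption, only a new commit on the leader changes the comparison, which matches the recovery behaviour observed.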

Log:

2024-05-22 06:18:42.186 INFO  (qtp16373883-27-null-23) [c: s: r: x:mycore 
t:null-23] o.a.s.c.S.Request webapp=/solr path=/replication 
params={wt=json&command=details} status=0 QTime=10

2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] 
o.a.s.h.IndexFetcher Leader's generation: 0

2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] 
o.a.s.h.IndexFetcher Leader's version: 0

2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] 
o.a.s.h.IndexFetcher Follower's generation: 2913

2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] 
o.a.s.h.IndexFetcher Follower's version: 1716300697144

2024-05-22 06:18:46.195 INFO  (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] 
o.a.s.h.IndexFetcher New index in Leader. Deleting mine...

We think the problem is in IndexFetcher, in this check:

old: if (IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {

Also checking forceReplication will probably fix the problem:

new: if (forceReplication && IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {
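
The effect of the proposed condition can be sketched with a self-contained model. It assumes the check is reached when the leader reports version 0, as the log above suggests, and it uses the follower's logged version as a stand-in for the commit timestamp. The class and method names are hypothetical and this is not the IndexFetcher code itself.

```java
// Simplified model of the old and proposed conditions; NOT the actual Solr source.
final class ReplicationGuardSketch {

  /** Old condition: clear the local index whenever it has a non-zero commit timestamp. */
  static boolean clearsLocalIndexOld(long leaderVersion, long localCommitTimestamp) {
    // Assumption: the check is only reached when the leader reports version 0.
    return leaderVersion == 0L && localCommitTimestamp != 0L;
  }

  /** Proposed condition: additionally require an explicitly forced replication. */
  static boolean clearsLocalIndexNew(long leaderVersion, long localCommitTimestamp,
                                     boolean forceReplication) {
    return leaderVersion == 0L && forceReplication && localCommitTimestamp != 0L;
  }

  public static void main(String[] args) {
    long leaderVersion = 0L;               // "Leader's version: 0" in the log
    long followerVersion = 1716300697144L; // "Follower's version" in the log

    System.out.println("old: "
        + clearsLocalIndexOld(leaderVersion, followerVersion));
    System.out.println("new, not forced: "
        + clearsLocalIndexNew(leaderVersion, followerVersion, false));
    System.out.println("new, forced: "
        + clearsLocalIndexNew(leaderVersion, followerVersion, true));
  }
}
```

In this model the old condition deletes the index for the logged values, while the proposed one deletes it only when replication is forced.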

While investigating the problem we also found an inconsistency in the response 
to the details request. It contains two leader fragments. When the leader is 
closed for replication, the property leader.replicationEnabled is set to true, 
whereas the property follower.leaderDetails.leader.replicationEnabled is correct.

 

Example

curl -s "https://solr9-repeater:8983/solr/mycore/replication?wt=json&command=details" |
jq '.details | {
  indexSize: .indexSize, indexVersion: .indexVersion,
  generation: .generation, indexPath: .indexPath,
  leader: {
    replicableVersion: .leader.replicableVersion,
    replicableGeneration: .leader.replicableGeneration,
    replicationEnabled: .leader.replicationEnabled
  },
  follower: { leaderDetails: {
    indexSize: .follower.leaderDetails.indexSize,
    generation: .follower.leaderDetails.generation,
    indexVersion: .follower.leaderDetails.indexVersion,
    indexPath: .follower.leaderDetails.indexPath,
    leader: {
      replicableVersion: .follower.leaderDetails.leader.replicableVersion,
      replicableGeneration: .follower.leaderDetails.leader.replicableGeneration,
      replicationEnabled: .follower.leaderDetails.leader.replicationEnabled
    }
  }}
}'

 

{
  "indexSize": "10.34 GB",
  "indexVersion": 1716358708159,
  "generation": 2913,
  "indexPath": "/var/solr/data/mycore/data/index.20240522061946262",
  "leader": {
    "replicableVersion": 1716358708159,
    "replicableGeneration": 2913,
    "replicationEnabled": "true"
  },
  "follower": {
    "leaderDetails": {
      "indexSize": "10.34 GB",
      "generation": 2913,
      "indexVersion": 1716358708159,
      "indexPath": "/var/solr/data/mycore/data/restore.20240508131046932",
      "leader": {
        "replicableVersion": 1716358708159,
        "replicableGeneration": 2913,
        "replicationEnabled": "false"
      }
    }
  }
}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

