[ https://issues.apache.org/jira/browse/SOLR-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SOLR-17306:
----------------------------------
    Labels: pull-request-available  (was: )

> Solr Repeater or Slave loses data after restart when replication is not enabled on leader
> -----------------------------------------------------------------------------------------
>
>                 Key: SOLR-17306
>                 URL: https://issues.apache.org/jira/browse/SOLR-17306
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 9.2, 9.3, 9.4, 9.5, 9.6
>            Reporter: Peter Kroiss
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: solr-replication-test.txt
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We are testing Solr 9.6.2 in a leader - repeater - follower configuration.
> During periods of heavy writes to the leader, we disable replication on the
> leader to save bandwidth.
> If the repeater restarts for any reason while replication is disabled on the
> leader, the repeater loses all documents and does not recover when the
> leader is opened for replication again.
> The documents are deleted, but the indexVersion and generation properties are
> set to the leader's values, so the repeater or follower sees nothing to fetch
> when the leader is opened for replication again.
> It recovers only when there are commits on the leader after replication has
> been reopened.
>
> Log:
> 2024-05-22 06:18:42.186 INFO (qtp16373883-27-null-23) [c: s: r: x:mycore t:null-23] o.a.s.c.S.Request webapp=/solr path=/replication params={wt=json&command=details} status=0 QTime=10
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] o.a.s.h.IndexFetcher Leader's generation: 0
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] o.a.s.h.IndexFetcher Leader's version: 0
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] o.a.s.h.IndexFetcher Follower's generation: 2913
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] o.a.s.h.IndexFetcher Follower's version: 1716300697144
> 2024-05-22 06:18:46.195 INFO (indexFetcher-43-thread-1) [c: s: r: x:mycore t:] o.a.s.h.IndexFetcher New index in Leader. Deleting mine...
>
> --> There is no new index on the leader; it is only closed for replication.
>
> We think the problem is in IndexFetcher. Adding a forceReplication check to
> the condition will probably fix the problem:
> old: if (IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {
> new: if (forceReplication && IndexDeletionPolicyWrapper.getCommitTimestamp(commit) != 0L) {
>
> While investigating the problem we also found an inconsistency in the
> details request. The response contains two leader fragments. When the leader
> is closed for replication, the property leader.replicationEnabled is set to
> true, while the property follower.leaderDetails.leader.replicationEnabled is
> correct.
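
To make the condition change proposed above for IndexFetcher easier to follow, here is a minimal standalone sketch. It is not Solr code: the class and method names are hypothetical, and it assumes that the check is reached only after the leader has reported version 0 (as the log suggests) and that the follower's logged version stands in for the value returned by getCommitTimestamp(commit).

```java
// Hypothetical, self-contained model of the two conditions quoted in the report.
// It does not call any Solr API; plain long values replace the real commit object.
class ReplicationDecisionSketch {

    // Condition quoted as "old": a non-empty local commit is cleared
    // whenever this branch is reached.
    static boolean oldConditionClearsIndex(long followerCommitTimestamp) {
        return followerCommitTimestamp != 0L;
    }

    // Condition quoted as "new": the local commit is cleared only when
    // replication was explicitly forced.
    static boolean newConditionClearsIndex(boolean forceReplication,
                                           long followerCommitTimestamp) {
        return forceReplication && followerCommitTimestamp != 0L;
    }

    public static void main(String[] args) {
        // "Follower's version" from the log; assumed to match the commit timestamp.
        long followerVersion = 1716300697144L;

        System.out.println("old condition clears index: "
                + oldConditionClearsIndex(followerVersion));
        System.out.println("new condition, not forced:  "
                + newConditionClearsIndex(false, followerVersion));
        System.out.println("new condition, forced:      "
                + newConditionClearsIndex(true, followerVersion));
    }
}
```

Under these assumptions, a scheduled poll against a leader that is closed for replication would no longer clear the local index, while a forced fetch would keep the previous behavior. Whether that matches the intended semantics of forceReplication in IndexFetcher has to be confirmed against the actual source.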
>
> Example
> curl -s "https://solr9-repeater:8983/solr/mycore/replication?wt=json&command=details" |
>   jq '.details | {
>     indexSize: .indexSize,
>     indexVersion: .indexVersion,
>     generation: .generation,
>     indexPath: .indexPath,
>     leader: {
>       replicableVersion: .leader.replicableVersion,
>       replicableGeneration: .leader.replicableGeneration,
>       replicationEnabled: .leader.replicationEnabled
>     },
>     follower: {
>       leaderDetails: {
>         indexSize: .follower.leaderDetails.indexSize,
>         generation: .follower.leaderDetails.generation,
>         indexVersion: .follower.leaderDetails.indexVersion,
>         indexPath: .follower.leaderDetails.indexPath,
>         leader: {
>           replicableVersion: .follower.leaderDetails.leader.replicableVersion,
>           replicableGeneration: .follower.leaderDetails.leader.replicableGeneration,
>           replicationEnabled: .follower.leaderDetails.leader.replicationEnabled
>         }
>       }
>     }
>   }'
>
> {
>   "indexSize": "10.34 GB",
>   "indexVersion": 1716358708159,
>   "generation": 2913,
>   "indexPath": "/var/solr/data/mycore/data/index.20240522061946262",
>   "leader": {
>     "replicableVersion": 1716358708159,
>     "replicableGeneration": 2913,
>     "replicationEnabled": "true"
>   },
>   "follower": {
>     "leaderDetails": {
>       "indexSize": "10.34 GB",
>       "generation": 2913,
>       "indexVersion": 1716358708159,
>       "indexPath": "/var/solr/data/mycore/data/restore.20240508131046932",
>       "leader": {
>         "replicableVersion": 1716358708159,
>         "replicableGeneration": 2913,
>         "replicationEnabled": "false"
>       }
>     }
>   }
> }

--
This message was sent by Atlassian Jira
(v8.20.10#820010)