RE: a better option is to use the backup/restore functionality in the Collections API.
My impression was that the Collections API has no facility for _incremental_ backup and restore: is there one? Doing a backup, scp, and restore of terabytes of data every few minutes does not sound practical.

What we have for our databases (MSSQL, Sybase, MongoDB, Postgres, MySQL as well as Solr) is redundancy (for failover) in our main Data Center, and redundancy in read-only, reasonably concurrent copies in our second Data Center. For Solr 8.11 (and earlier), we have had SolrCloud for redundancy in our main Data Center, and Leader/Follower replication to the read-only SolrClouds in the second Data Center. At one time, we were hoping that CDCR would work better, but we never managed to get CDCR to work reliably (and perhaps others also found it unreliable, leading to it being deprecated rather than fixed). We have found that SolrCloud does not work reliably when spread across Data Centers, so Leader/Follower replication (formerly known as Master/Slave replication) has been the only way we have found to keep our (read-only) copies in the second Data Center only a few minutes behind the data in the main Data Center (the only way to provide latency low enough to be comparable to MSSQL, Sybase, MongoDB, Postgres and MySQL).

My supervisor was asking for clarification whether you are implying that Leader/Follower replication is being deprecated.

In my continuing attempts to resolve these issues, I have come across a related question. The error message about SolrAuthV2 prompted me to wonder about the V1 and V2 syntax options (such as shown at https://solr.apache.org/guide/solr/9_2/deployment-guide/collection-management.html#reload); I was wondering whether Leader/Follower replication changed from using syntax V1 to using syntax V2, and whether that might contribute to the breaking of permissions.

As an experiment in our test environment, I set up a permission in security.json to allow RELOAD collection without a password.
After confirming that my V2 syntax does work _with_ a password, I then attempted RELOAD collection without a password using both syntax V1 and syntax V2. Syntax V1 succeeded and syntax V2 failed. I have tried several permutations in security.json to allow RELOAD without a password in syntax V2, but have not yet found a successful permutation.

Can anyone clarify what security.json changes are needed for syntax V2? Can it be confirmed whether Leader/Follower replication is using V2 (in other words, whether that may be contributing to the permission problem)?

[11:51 dbh19850s 1152]$ curl -X POST "http://`cat /tmp/pswd230808`@localhost:$p/api/collections/helpdocs" -H 'Content-Type: application/json' -d '{"reload":{}}'
{
  "responseHeader":{
    "status":0,
    "QTime":255},
  "success":{
    "nosqltest21.be-md:9852_solr":{
      "responseHeader":{
        "status":0,
        "QTime":176}},
    "nosqltest22.be-md:9852_solr":{
      "responseHeader":{
        "status":0,
        "QTime":219}}}}

[11:51 dbh19850s 1152]$ curl "http://localhost:$p/solr/admin/collections?action=RELOAD&name=helpdocs"
{
  "responseHeader":{
    "status":0,
    "QTime":237},
  "success":{
    "nosqltest21.be-md:9852_solr":{
      "responseHeader":{
        "status":0,
        "QTime":176}},
    "nosqltest22.be-md:9852_solr":{
      "responseHeader":{
        "status":0,
        "QTime":216}}}}

[11:51 dbh19850s 1153]$ curl -X POST "http://localhost:$p/api/collections/helpdocs" -H 'Content-Type: application/json' -d '{"reload":{}}'
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 401 Authentication failed, Response code: 401</title>
</head>
<body><h2>HTTP ERROR 401 Authentication failed, Response code: 401</h2>
<table>
<tr><th>URI:</th><td>/solr/____v2/collections/helpdocs</td></tr>
<tr><th>STATUS:</th><td>401</td></tr>
<tr><th>MESSAGE:</th><td>Authentication failed, Response code: 401</td></tr>
<tr><th>SERVLET:</th><td>default</td></tr>
</table>
</body>
</html>

[11:51 dbh19850s 1154]$ grep helpdocs 2023_08_10.request.log |tail -3
127.0.0.1 - - [10/Aug/2023:15:51:23 +0000] "POST /api/collections/helpdocs HTTP/1.1" 200 280
127.0.0.1 - - [10/Aug/2023:15:51:39 +0000] "GET /solr/admin/collections?action=RELOAD&name=helpdocs HTTP/1.1" 200 280
127.0.0.1 - - [10/Aug/2023:15:51:47 +0000] "POST /api/collections/helpdocs HTTP/1.1" 401 491

[11:51 dbh19850s 1155]$ less solr.log
...
2023-08-10 11:51:39.878 DEBUG (qtp1003693033-264) [ ] o.a.s.s.RuleBasedAuthorizationPluginBase Found perm [{
  "name":"openreload8",
  "path":"/admin/collections",
  "index":9,
  "collection":null,
  "role":null,
  "params":{
    "action":["REGEX:(?i)RELOAD"],
    "name":["REGEX:(?i)helpdocs"]}}] to govern resource [/admin/collections]
2023-08-10 11:51:39.878 DEBUG (qtp1003693033-264) [ ] o.a.s.s.RuleBasedAuthorizationPluginBase Governing permission [{
  "name":"openreload8",
  "path":"/admin/collections",
  "index":9,
  "collection":null,
  "role":null,
  "params":{
    "action":["REGEX:(?i)RELOAD"],
    "name":["REGEX:(?i)helpdocs"]}}] has no role; permitting access
...
2023-08-10 11:51:47.731 DEBUG (qtp1003693033-249) [ ] o.a.s.s.RuleBasedAuthorizationPluginBase Found perm [{
  "name":"catch-all-nocollection",
  "path":"/*",
  "index":24,
  "collection":null,
  "role":"allgen"}] to govern resource [/____v2/collections/helpdocs]
2023-08-10 11:51:47.731 DEBUG (qtp1003693033-249) [ ] o.a.s.s.RuleBasedAuthorizationPluginBase Governing permission [{
  "name":"catch-all-nocollection",
  "path":"/*",
  "index":24,
  "collection":null,
  "role":"allgen"}] has role, but request principal cannot be identified; forbidding access

As a side note, in our experience, the only thing that has been cluttering up solr.log with attempts to connect without a password has been Leader/Follower replication (formerly known as Master/Slave replication).

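For anyone wanting to repeat the unauthenticated V1/V2 comparison above, here is a minimal sketch that prints only the HTTP status of each call. The function names and the host:port argument are illustrative placeholders, not part of Solr; the two endpoints and the {"reload":{}} body are the ones used in the curl commands above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; endpoints are taken from the
# curl commands shown earlier in this message.

# Build the V1 RELOAD URL. $1 = host:port, $2 = collection name.
v1_reload_url() {
  printf 'http://%s/solr/admin/collections?action=RELOAD&name=%s' "$1" "$2"
}

# Build the V2 collection URL (RELOAD is requested via the POST body).
v2_reload_url() {
  printf 'http://%s/api/collections/%s' "$1" "$2"
}

# Run curl with the given arguments and print only the HTTP status code.
status_of() {
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

# Issue both requests without credentials and report the status of each.
compare_reload() {
  v1_status=$(status_of "$(v1_reload_url "$1" "$2")")
  v2_status=$(status_of -X POST -H 'Content-Type: application/json' \
    -d '{"reload":{}}' "$(v2_reload_url "$1" "$2")")
  printf 'V1 GET  -> %s\nV2 POST -> %s\n' "$v1_status" "$v2_status"
}
```

It could be invoked as `compare_reload localhost:8983 helpdocs`, where the port is a placeholder for whatever $p is in the environment. In the run shown above, the request log recorded 200 for the V1 call and 401 for the unauthenticated V2 call.
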
-----Original Message-----
From: Shawn Heisey <apa...@elyograg.org>
Sent: Saturday, August 5, 2023 4:24 PM
To: users@solr.apache.org
Subject: [EXTERNAL] Re: authentication for Leader/Follower replication

On 7/6/23 14:00, Oakley, Craig (NIH/NLM/NCBI) [C] wrote:
> We are having problems transitioning Leader/Follower replication to Solr9.2.1
>
> In Solr8.5 and below, what was then called Master/Slave replication had the
> annoying problem that, even though we specified httpBasicAuthUser and
> httpBasicAuthPassword, it would always attempt to connect first without a
> password before retrying with a password. This made solr.log noisy with lots
> of unnecessary login failures: but at least it worked.

In general, this is how basic auth via http works. The client first attempts the request without any credentials, and receives a 401 response. At this point, a browser would see the 401 response and display a popup asking for username/password. Then on future requests to that server in the current session, the browser sends the supplied credentials on every request to that server.

If you are supplying credentials in the replication config, it should NOT follow that pattern ... the credentials should be always used.

> Now when we are preparing to upgrade to Solr9.2.1, we are having issues with
> the following:
> 2023-07-06 15:46:53.315 INFO (indexFetcher-39-thread-1) [ ]
> o.a.s.h.IndexFetcher Last replication failed, so I'll force replication
> 2023-07-06 15:46:53.320 WARN (indexFetcher-39-thread-1) [ ]
> o.a.s.h.IndexFetcher Leader at:
> http://[REDACTED]/solr/sequence2_shard1_replica_n1 is not available.

The info above is valid for Solr running in standalone mode. But those core names indicate that you are running in SolrCloud mode.

In cloud mode, Solr handles all replication. Don't attempt to actually configure the replication handler in cloud mode ... Solr will handle it all for you, and will even automatically take care of authenticating the requests.
You don't need to configure the replication handler at all. If you are using the replication handler as a "back door" to copy indexes to a separate Solr install, a better option is to use the backup/restore functionality in the Collections API.

Thanks,
Shawn