Jason Gerlowski created SOLR-17515:
--------------------------------------

             Summary: Recovery fails in Solr 9.7.0 if basic-auth is enabled
                 Key: SOLR-17515
                 URL: https://issues.apache.org/jira/browse/SOLR-17515
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 9.7
            Reporter: Jason Gerlowski

Several reporters on the users@ list recently shared a bug they noticed after upgrading to Solr 9.7. Replicas would try to recover, but fail with a NullPointerException:

{code}
2024-09-18 09:36:31.238 ERROR (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Error while trying to recover. core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
	at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
	at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:212) ~[metrics-core-4.2.26.jar:4.2.26]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
2024-09-18 09:36:31.238 ERROR (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2024-09-18 09:36:31.238 INFO (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Wait [4] seconds before trying to recover again (attempt=1)
{code}

It turns out that the issue isn't specific to upgrading clusters: any 9.7.0 cluster (new or existing/upgrading) that uses basic-auth will hit this NPE during replica recovery. The result is that replicas fail to recover and stay marked as "recovering" indefinitely.

The issue can be reproduced locally in a source checkout using the following steps:

{code}
git checkout branch_9_7
./gradlew clean assemble
cd solr/packaging/build/solr-9.7.0-SNAPSHOT

# At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, "_default" configset
bin/solr start -e cloud
bin/solr post -c gettingstarted example/exampledocs/books.json

# Stop the node containing the non-leader replica
bin/solr stop -p <port>
bin/solr post -c gettingstarted example/exampledocs/books.csv

# Enable auth and trigger recovery by turning the node back on
bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown true
# This line will need to be tweaked based on which Solr node was previously stopped
"bin/solr" start --cloud -p <port> -s "example/cloud/<node>/solr" -z 127.0.0.1:9983
{code}
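
One way to confirm the outcome of the reproduction steps is to ask the Collections API for the cluster status and look at the reported states. The following is only a sketch under assumptions: it supposes that a node still running from the cloud example listens on localhost:8983, it reuses the solr:solrRocks credentials from the steps above, and the helper function names are hypothetical. The filter is a plain grep, so it prints shard-level states as well as replica states.

```shell
#!/usr/bin/env bash
# Hypothetical helper names. Host, port and credentials are assumptions
# based on the reproduction steps and the cloud example defaults.

# Build the Collections API CLUSTERSTATUS URL for one collection.
clusterstatus_url() {
  local host="$1" port="$2" collection="$3"
  printf 'http://%s:%s/solr/admin/collections?action=CLUSTERSTATUS&collection=%s\n' \
    "$host" "$port" "$collection"
}

# Read CLUSTERSTATUS JSON on stdin and print every "state" entry.
# Note: this matches the shard-level state too, not only replica states.
extract_states() {
  grep -Eo '"state" *: *"[a-z_]+"'
}

# Fetch the cluster status using basic-auth credentials and list the states.
show_states() {
  local user_pass="$1" host="$2" port="$3" collection="$4"
  curl -s -u "$user_pass" "$(clusterstatus_url "$host" "$port" "$collection")" \
    | extract_states
}

# Example invocation (not run automatically):
#   show_states solr:solrRocks localhost 8983 gettingstarted
```

If the bug is present, the restarted node's replica would be expected to keep showing a "recovering" state across repeated calls instead of moving to "active".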