Re: [PR] Cleanup JDK version related logic in scripts [solr]

2024-10-25 Thread via GitHub


epugh commented on code in PR #2792:
URL: https://github.com/apache/solr/pull/2792#discussion_r1815533533


##
solr/bin/solr:
##
@@ -1054,31 +1054,13 @@ fi
 
 # Establish default GC logging opts if no env var set (otherwise init to sensible default)
 if [ -z "${GC_LOG_OPTS}" ]; then
-  if [[ "$JAVA_VER_NUM" -lt "9" ]] ; then
-    GC_LOG_OPTS=('-verbose:gc' '-XX:+PrintHeapAtGC' '-XX:+PrintGCDetails' \
-                 '-XX:+PrintGCDateStamps' '-XX:+PrintGCTimeStamps' '-XX:+PrintTenuringDistribution' \
-                 '-XX:+PrintGCApplicationStoppedTime')
-  else
-    GC_LOG_OPTS=('-Xlog:gc*')
-  fi
-else
-  # TODO: Should probably not overload GC_LOG_OPTS as both string and array, but leaving it be for now
-  # shellcheck disable=SC2128
-  GC_LOG_OPTS=($GC_LOG_OPTS)
+  GC_LOG_OPTS=('-Xlog:gc*')
 fi
 
 # if verbose gc logging enabled, setup the location of the log file and rotation
 if [ "${#GC_LOG_OPTS[@]}" -gt 0 ]; then
-  if [[ "$JAVA_VER_NUM" -lt "9" ]] || [ "$JAVA_VENDOR" == "OpenJ9" ]; then
-    gc_log_flag="-Xloggc"
-    if [ "$JAVA_VENDOR" == "OpenJ9" ]; then
-      gc_log_flag="-Xverbosegclog"
-    fi
-    if [ -z ${JAVA8_GC_LOG_FILE_OPTS+x} ]; then

Review Comment:
   What does line 1077 mean?  Is it possible that it is saying that if the 
variable "JAVA8_GC_LOG_FILE_OPTS" isn't set, then we do the GC_LOG_OPTS?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-17161) Separate out a solrj-jetty artifact (10.0)

2024-10-25 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892565#comment-17892565
 ] 

Jason Gerlowski commented on SOLR-17161:


I guess I have a potential concern about moving the Jetty based clients into 
their own artifact.  Sorry to bring it to the table so late.  (To be clear - 
it's a "concern" and "request for info", and not a veto or anything like that.)

In short: it goes without saying how important defaults are in software, and 
making the JDK-based client the only client available in 'solrj-core' will make 
it the "effective default" for a lot of folks.  Most users will start by just 
grabbing 'solrj-core', and then their IDE's autocomplete will suggest 
HttpJdkSolrClient (and only HttpJdkSolrClient).  That's a big deal!  And I 
worry about making that sort of change without a discussion about whether it's 
our best option, holistically.

The JDK-based client is a clear winner on some concerns, e.g. dependency 
footprint.  But that's not the only concern users are likely to have: is there 
any difference in the perf characteristics of the two underlying HttpClients?  
what sort of hooks does each offer into the request/response or client 
lifecycle? do the clients differ in how much they let users customize threading 
or connection-pooling behavior? is there a popularity gap, or are folks already 
pretty familiar with one HttpClient in particular?  any logging or tracing 
differences?

Has this sort of holistic discussion happened somewhere that I just missed?  If 
not, maybe we could have that here?

> Separate out a solrj-jetty artifact (10.0)
> --
>
> Key: SOLR-17161
> URL: https://issues.apache.org/jira/browse/SOLR-17161
> Project: Solr
>  Issue Type: Sub-task
>  Components: clients - java
>Reporter: Jan Høydahl
>Priority: Blocker
> Fix For: main (10.0)
>
>
> Given we have a native JDK based client in SOLR-599, we can separate out all 
> {{Http2SolrClient}} and friends with their jetty-client dependencies into a 
> separate artifact {{{}solrj-jetty{}}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] SOLR-17504: CoreContainer calls UpdateHandler.commit. [solr]

2024-10-25 Thread via GitHub


dsmiley commented on code in PR #2786:
URL: https://github.com/apache/solr/pull/2786#discussion_r1815544031


##
solr/core/src/java/org/apache/solr/core/CoreContainer.java:
##
@@ -2061,13 +2066,16 @@ public void reload(String name, UUID coreId) {
           RefCounted<IndexWriter> iwRef = core.getSolrCoreState().getIndexWriter(null);
           if (iwRef != null) {
             IndexWriter iw = iwRef.get();
-            // switch old core to readOnly
-            core.readOnly = true;

Review Comment:
   as an aside, I don't like that CoreContainer is doing SolrCore internal 
manipulations... like this should be a method on SolrCore like core.commit()






[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-10-25 Thread Sanjay Dutt (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892684#comment-17892684
 ] 

Sanjay Dutt commented on SOLR-17497:


{code:java}
@Test
public void test() {
  ExecutorService fsyncService =
      ExecutorUtil.newMDCAwareSingleThreadExecutor(new SolrNamedThreadFactory("fsyncService"));
  try {
    fsyncService.submit(() -> {
      throw new AlreadyClosedException("Directory is already closed!");
    });
  } catch (Exception e) {
    System.out.println(e);
  } finally {
    fsyncService.shutdown();
  }
}{code}
In [https://github.com/apache/solr/pull/2707], we basically replaced 
ExecutorService#submit with ExecutorService#execute, and execute now throws the 
exception rather than suppressing it. This can be checked with the example 
above: as written (with submit) it does not fail, whereas if you use execute 
it fails immediately.
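
The submit/execute difference described above can be sketched with the plain JDK executor alone. This is a minimal illustration, not Solr's code: it assumes a standard `Executors.newSingleThreadExecutor` rather than Solr's MDC-aware executor, and it uses `IllegalStateException` as a stand-in for Lucene's `AlreadyClosedException`. The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class SubmitVsExecuteDemo {

  // Stand-in for the failing fsync task; IllegalStateException replaces
  // Lucene's AlreadyClosedException so that only the JDK is needed.
  static final Runnable FAILING_TASK = () -> {
    throw new IllegalStateException("Directory is already closed!");
  };

  /** submit(): the failure is stored in the Future and surfaces only through get(). */
  static Throwable failureSeenViaSubmit() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<?> future = pool.submit(FAILING_TASK);
      future.get(5, TimeUnit.SECONDS);
      return null;
    } catch (ExecutionException e) {
      return e.getCause();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  /** execute(): the failure escapes the worker and reaches its uncaught-exception handler. */
  static Throwable failureSeenViaExecute() {
    AtomicReference<Throwable> seen = new AtomicReference<>();
    CountDownLatch handled = new CountDownLatch(1);
    ExecutorService pool = Executors.newSingleThreadExecutor(runnable -> {
      Thread worker = new Thread(runnable, "demo-worker");
      worker.setUncaughtExceptionHandler((thread, error) -> {
        seen.set(error);
        handled.countDown();
      });
      return worker;
    });
    try {
      pool.execute(FAILING_TASK);
      handled.await(5, TimeUnit.SECONDS);
      return seen.get();
    } catch (InterruptedException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("submit : " + failureSeenViaSubmit());
    System.out.println("execute: " + failureSeenViaExecute());
  }
}
```

With submit, nothing is reported unless someone calls `get()` on the returned Future, which is why a test that only submits the task passes. With execute, the same exception reaches the thread's uncaught-exception handler, which is the kind of error a test framework can pick up as an uncaught exception in a worker thread.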

> Pull replicas throws AlreadyClosedException  
> -
>
> Key: SOLR-17497
> URL: https://issues.apache.org/jira/browse/SOLR-17497
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Sanjay Dutt
>Priority: Major
> Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png
>
>
> Recently, a common exception (org.apache.lucene.store.AlreadyClosedException: 
> this Directory is closed) has been seen in multiple failed test cases. 
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
> FAILED:  
> org.apache.solr.cloud.SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull
> FAILED:  org.apache.solr.cloud.TestPullReplica.testAddDocs
>  
>  
> {code:java}
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=10271, 
> name=fsyncService-6341-thread-1, state=RUNNABLE, 
> group=TGRP-SplitShardWithNodeRoleTest]
>         at 
> __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4:E5DB3E97188A8EB9]:0)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
> closed
>         at __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4]:0)
>         at 
> app//org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:50)
>         at 
> app//org.apache.lucene.store.ByteBuffersDirectory.sync(ByteBuffersDirectory.java:237)
>         at 
> app//org.apache.lucene.tests.store.MockDirectoryWrapper.sync(MockDirectoryWrapper.java:214)
>         at 
> app//org.apache.solr.handler.IndexFetcher$DirectoryFile.sync(IndexFetcher.java:2034)
>         at 
> app//org.apache.solr.handler.IndexFetcher$FileFetcher.lambda$fetch$0(IndexFetcher.java:1803)
>         at 
> app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base@11.0.24/java.lang.Thread.run(Thread.java:829)
>  {code}
>  
> An interesting thing about these test cases is that they all share the same 
> kind of setup, where each has one shard and two replicas: one NRT and one PULL.
>  
> Going through the execution steps of one of the test cases:
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
>  
> Test flow
> 1. Create a collection with 1 NRT and 1 PULL replica
> 2. waitForState
> 3. waitForNumDocsInAllActiveReplicas(0); // *Name says it all*
> 4. Index another document.
> 5. waitForNumDocsInAllActiveReplicas(1);
> 6. Stop Pull replica
> 7. Index another document
> 8. waitForNumDocsInAllActiveReplicas(2);
> 9. Start Pull Replica
> 10. waitForState
> 11. waitForNumDocsInAllActiveReplicas(2);
>  
> As per the logs the whole sequence executed successfully. Here is the link to 
> the logs: 
> [https://ge.apache.org/s/yxydiox3gvlf2/tests/task/:solr:core:test/details/org.apache.solr.cloud.TestPullReplica/testKillPullReplica/1/output]
>  (link may stop working in the future)
>  
> The last step, which makes sure that all active replicas have two documents 
> each, logged an INFO message, which is further proof that it completed 
> successfully. 
>  
> {code:java}
> 616575 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node3 
> (https://127.0.0.1:35647/solr/pull_replica_test_kill_pull_replica_shard1_replica_n1/)
>  has all 2 docs 616606 INFO (qtp1091538342-13057-null-11348) 
> [n:127.0.0.1:38207_solr c:pull_replica_test_kill_pull_replica s:shard1 
> r:core_node4 x:pull_replica_test_kill_pull_replica_shard1_replica_p2 
> t:null-11348] o.a.s.c.S.Request webapp=/solr path=/select 
> params={q=*:*&wt=javabin&version=2} rid=null-11348 hits=2 status=0 QTime=

[jira] [Updated] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2024-10-25 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-17515:
---
Description: 
Several reporters on the users@ list recently shared a bug they noticed on 
upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
NullPointerException:

{code}
2024-09-18 09:36:31.238 ERROR 
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
o.a.s.c.RecoveryStrategy Error while trying to recover. 
core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
invoke 
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
 because "this.authenticationStore" is null
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
java.lang.NullPointerException: Cannot invoke 
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
 because "this.authenticationStore" is null
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
 ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
 ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 
2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 
2024-09-03 15:05:20]
...
{code}

It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
during replica recovery*.  The result is that replicas will fail to recover, 
and sit marked as "recovering" indefinitely.

The issue can be reproduced locally in a source-checkout using the following 
steps:

{code}
git checkout branch_9_7
./gradlew clean assemble
cd solr/packaging/build/solr-9.7.0-SNAPSHOT

# At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, 
"_default" configset
bin/solr start -e cloud

bin/solr post -c gettingstarted example/exampledocs/books.json
# Stop the node containing the non-leader replica
bin/solr stop -p 
bin/solr post -c gettingstarted example/exampledocs/books.csv

# Enable auth and trigger recovery by turning the node back on
bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown 
true
# This line will need to be tweaked based on which Solr node was previously stopped
"bin/solr" start --cloud -p  -s "example/cloud//solr" -z 
127.0.0.1:9983
{code}

  was:
Several reporters on the users@ list, recently shared a bug they noticed on 
upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
NullPointerException:

{code}
2024-09-18 09:36:31.238 ERROR 
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
o.a.s.c.RecoveryStrategy Error while trying to recover. 
core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
invoke 
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.

[jira] [Commented] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2024-10-25 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892831#comment-17892831
 ] 

Jason Gerlowski commented on SOLR-17515:


[~sanjaydutt] pointed me at the likely culprit:

At various points, the RecoveryStrategy code bootstraps a new Http2SolrClient 
based on an existing one.  But this bootstrapping overlooks the 
'authenticationStore' object from the existing client, which results in an NPE 
when code later on expects it to be set.  The place to fix this is _probably_ 
in the "withHttpClient" builder method used by RecoveryStrategy (see the 
calling snippet below):

{code:title=RecoveryStrategy#recoverySolrClientBuilder}
  private Http2SolrClient.Builder recoverySolrClientBuilder(String baseUrl, String leaderCoreName) {
    final UpdateShardHandlerConfig cfg = cc.getConfig().getUpdateShardHandlerConfig();
    return new Http2SolrClient.Builder(baseUrl)
        .withDefaultCollection(leaderCoreName)
        .withHttpClient(cc.getUpdateShardHandler().getRecoveryOnlyHttpClient());
  }
{code}
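
The failure mode described above (a new client bootstrapped from an existing one without carrying over its auth holder) can be illustrated with a small self-contained sketch. Everything here is hypothetical: `AuthHolder`, `Client` and the two bootstrap methods are simplified stand-ins and do not reflect the actual Http2SolrClient internals or the eventual fix.

```java
class BootstrapAuthDemo {

  /** Hypothetical stand-in for an object that holds credentials. */
  static final class AuthHolder {
    private String credentials;

    void update(String newCredentials) {
      credentials = newCredentials;
    }

    String credentials() {
      return credentials;
    }
  }

  /** Hypothetical client: a transport (a plain Object here) plus an auth holder. */
  static final class Client {
    final Object transport;
    final AuthHolder authHolder;

    Client(Object transport, AuthHolder authHolder) {
      this.transport = transport;
      this.authHolder = authHolder;
    }

    /** Throws NullPointerException when the holder was never set. */
    void setCredentials(String credentials) {
      authHolder.update(credentials);
    }
  }

  /** Reuses only the transport: the auth holder is left null. */
  static Client bootstrapTransportOnly(Client existing) {
    return new Client(existing.transport, null);
  }

  /** Reuses the transport and the auth holder together. */
  static Client bootstrapWithAuth(Client existing) {
    return new Client(existing.transport, existing.authHolder);
  }

  /** Reports whether configuring credentials on the client hits an NPE. */
  static boolean failsWithNpe(Client client) {
    try {
      client.setCredentials("solr:solrRocks");
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Client original = new Client(new Object(), new AuthHolder());
    System.out.println("transport only: NPE=" + failsWithNpe(bootstrapTransportOnly(original)));
    System.out.println("with auth     : NPE=" + failsWithNpe(bootstrapWithAuth(original)));
  }
}
```

The point of the sketch is only that any code path which later configures authentication assumes the holder is non-null, so whichever method copies state from the existing client needs to copy the holder as well.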

> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Priority: Major
>
> Several reporters on the users@ list recently shared a bug they noticed on 
> upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR 
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
> dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
> s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
> o.a.s.c.RecoveryStrategy Error while trying to recover. 
> core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
> invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
> ...
> {code}
> It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
> cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
> during replica recovery*.  The result is that replicas will fail to recover, 
> and sit marked as "recovering" indefinitely.
> T

[PR] SOLR-10654: Prometheus regex cloud pattern fix for core names [solr]

2024-10-25 Thread via GitHub


mlbiscoc opened a new pull request, #2795:
URL: https://github.com/apache/solr/pull/2795

   https://issues.apache.org/jira/browse/SOLR-10654
   
   # Description
   
   The regex pattern for Solr cloud mode assumed that all core names end with 
`replica_n[0-9]+`, which is incorrect. Core names should be allowed to have 
any single character before the digits.
   
   # Solution
   
   Change the regex pattern to use `.` instead of `n`, so that it matches any single character
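
The effect of the change can be sketched as follows. The two patterns are simplified, hypothetical stand-ins (the exporter's real regex is not quoted in this message); the core names are ones that appear elsewhere in this archive.

```java
import java.util.regex.Pattern;

class CoreNamePatternDemo {

  // Hypothetical simplified patterns, not the exporter's actual regex.
  static final Pattern OLD_PATTERN = Pattern.compile(".+_shard[0-9]+_replica_n[0-9]+");
  static final Pattern NEW_PATTERN = Pattern.compile(".+_shard[0-9]+_replica_.[0-9]+");

  static boolean matchesOld(String coreName) {
    return OLD_PATTERN.matcher(coreName).matches();
  }

  static boolean matchesNew(String coreName) {
    return NEW_PATTERN.matcher(coreName).matches();
  }

  public static void main(String[] args) {
    String nrtCore = "dovecot_fts_shard5_replica_n61";
    String pullCore = "pull_replica_test_kill_pull_replica_shard1_replica_p2";
    System.out.println("old/nrt : " + matchesOld(nrtCore));
    System.out.println("old/pull: " + matchesOld(pullCore));
    System.out.println("new/nrt : " + matchesNew(nrtCore));
    System.out.println("new/pull: " + matchesNew(pullCore));
  }
}
```

A core name ending in `replica_p2` is rejected by the `n`-only pattern and accepted once the type letter is matched with `.`.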
   
   # Tests
   
   Added `testCloudCorePattern` and `testBadCloudCorePattern` to test the cloud 
regex pattern.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended, not available for 
branches on forks living under an organisation)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [x] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   





Re: [PR] Bump up Java version to 21 [solr]

2024-10-25 Thread via GitHub


iamsanjay merged PR #2682:
URL: https://github.com/apache/solr/pull/2682





[jira] [Commented] (SOLR-17511) CLI: Resole -i conflicts (async-id, cluster-id)

2024-10-25 Thread Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892748#comment-17892748
 ] 

Eric Pugh commented on SOLR-17511:
--

I think instead of ExportTool you mean SolrExporter ;)

> CLI: Resole -i conflicts (async-id, cluster-id)
> ---
>
> Key: SOLR-17511
> URL: https://issues.apache.org/jira/browse/SOLR-17511
> Project: Solr
>  Issue Type: Sub-task
>  Components: cli
>Affects Versions: 9.7, 9.6.1
>Reporter: Christos Malliaridis
>Priority: Minor
>  Labels: cli
>
> The CLI flag {{\-i}} is currently used in two options:
> - for {{async-id}} in SnapshotExportTool for specifying an asynchronous 
> request identifier
> - for {{cluster-id}} in ExportTool for specifying a unique cluster identifier
> Since both short options are not obvious and the letter {{i}} may be used in 
> another context in the future, we should reserve it and deprecate (9.8) / 
> remove (10.0) it from the above options.






[jira] [Updated] (SOLR-17511) CLI: Resole -i conflicts (async-id, cluster-id)

2024-10-25 Thread Christos Malliaridis (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christos Malliaridis updated SOLR-17511:

Description: 
The CLI flag {{\-i}} is currently used in two options:
- for {{async-id}} in SnapshotExportTool for specifying an asynchronous request 
identifier
- for {{cluster-id}} in SolrExporter for specifying a unique cluster identifier

Since both short options are not obvious and the letter {{i}} may be used in 
another context in the future, we should reserve it and deprecate (9.8) / 
remove (10.0) it from the above options.


  was:
The CLI flag {{\-i}} is currently used in two options:
- for {{async-id}} in SnapshotExportTool for specifying an asynchronous request 
identifier
- for {{cluster-id}} in ExportTool for specifying a unique cluster identifier

Since both short options are not obvious and the letter {{i}} may be used in 
another context in the future, we should reserve it and deprecate (9.8) / 
remove (10.0) it from the above options.



> CLI: Resole -i conflicts (async-id, cluster-id)
> ---
>
> Key: SOLR-17511
> URL: https://issues.apache.org/jira/browse/SOLR-17511
> Project: Solr
>  Issue Type: Sub-task
>  Components: cli
>Affects Versions: 9.7, 9.6.1
>Reporter: Christos Malliaridis
>Priority: Minor
>  Labels: cli
>
> The CLI flag {{\-i}} is currently used in two options:
> - for {{async-id}} in SnapshotExportTool for specifying an asynchronous 
> request identifier
> - for {{cluster-id}} in SolrExporter for specifying a unique cluster 
> identifier
> Since both short options are not obvious and the letter {{i}} may be used in 
> another context in the future, we should reserve it and deprecate (9.8) / 
> remove (10.0) it from the above options.






Re: [PR] SOLR-17511: Deprecate -i CLI usages [solr]

2024-10-25 Thread via GitHub


epugh commented on PR #2794:
URL: https://github.com/apache/solr/pull/2794#issuecomment-2437644101

   Thanks for the review!   I think if the tests pass this is ready for merging!





Re: [PR] SOLR-17511: Deprecate -i CLI usages [solr]

2024-10-25 Thread via GitHub


malliaridis commented on code in PR #2794:
URL: https://github.com/apache/solr/pull/2794#discussion_r1816622405


##
solr/prometheus-exporter/src/java/org/apache/solr/prometheus/scraper/SolrScraper.java:
##
@@ -184,7 +184,7 @@ protected MetricSamples request(SolrClient client, MetricsQuery query) throws IO
         labelValues.add(zkHostLabelValue);
       }
 
-      // Add the unique cluster ID, either as specified on cmdline -i or baseUrl/zkHost
+      // Add the unique cluster ID, either as specified on cmdline --cluster-id or baseUrl/zkHost

Review Comment:
   ```suggestion
       // Add the unique cluster ID, either as specified on cmdline --cluster-id or
       // baseUrl/zkHost
   ```






[jira] [Comment Edited] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-10-25 Thread Sanjay Dutt (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892798#comment-17892798
 ] 

Sanjay Dutt edited comment on SOLR-17497 at 10/25/24 1:29 PM:
--

Yeah, you are right. I have to look more into this subject (execute vs submit) 
and how this whole thing works.

+1 to this.
{quote}For the case here in IndexFetcher, as long as the exception that is 
thrown is logged, I think we should suppress its propagation further.
{quote}
I was also looking into why we are getting "User aborted replication" messages. 
In the case of PULL replicas, RecoveryStrategy cancels the replication. Here is 
the explanation from the old JIRA.

https://issues.apache.org/jira/browse/SOLR-10233
{quote}
h3. Passive replica dies (or is unreachable)

Replica won’t be query-able. On restart, replica will recover from the leader, 
following the same flow as _realtime_ replicas: set state to DOWN, then 
RECOVERING, and finally ACTIVE. _Passive_ replicas will use a different 
{{RecoveryStrategy}} implementation, that omits *preparerecovery,* and peer 
sync attempt, it will jump to replication . If the leader didn't change, or if 
the other replicas are of type “append”, replication should be incremental. 
Once the first replication is done, passive replica will declare itself active 
and start serving traffic.
{quote}
*RecoveryStrategy.java*
{noformat}
log.info("Stopping background replicate from leader process");
zkController.stopReplicationFromLeader(coreName);
replicate(zkController.getNodeName(), core, leaderprops);{noformat}
My own theory:

1. RecoveryStrategy cancels replication.

2. FileFetcher#fetchPackets throws ReplicationHandlerException
{code:java}
if (stop) {
  stop = false;
  aborted = true;
  throw new ReplicationHandlerException("User aborted replication");
}{code}
3. FileFetcher#fetch runs its finally block, where the sync is executed asynchronously
{code:java}
fsyncService.submit(() -> {
  try {
    file.sync();
  } catch (IOException e) {
    fsyncException = e;
  } catch (InterruptedException e) {
    throw new RuntimeException(e);
  }
});{code}
4. At the same time, control returns to fetchLatestIndex, which performs 
cleanup and closes the directory
{code:java}
finally {
  if (!cleanupDone) {
    cleanup(solrCore, tmpIndexDir, indexDir, deleteTmpIdxDir, tmpTlogDir, successfulInstall);
  }
}{code}
I believe there is basically a race condition between steps 3 and 4. I have 
not been able to reproduce it on my system yet.
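
The suspected interleaving between steps 3 and 4 can be sketched in isolation. This is only an illustration of the theory, not a reproduction of the Solr failure: `FakeDirectory` is a hypothetical stand-in for a Lucene Directory, `IllegalStateException` stands in for `AlreadyClosedException`, and a latch forces the ordering that the real code would only hit by timing.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class SyncCloseRaceDemo {

  /** Hypothetical stand-in for a directory that rejects sync() once closed. */
  static final class FakeDirectory {
    private final AtomicBoolean open = new AtomicBoolean(true);

    void sync() {
      if (!open.get()) {
        throw new IllegalStateException("this Directory is closed");
      }
    }

    void close() {
      open.set(false);
    }
  }

  /** Returns the failure seen by the asynchronous sync, or null if it succeeded. */
  static Throwable syncFailure(boolean waitForSyncBeforeClose) {
    FakeDirectory dir = new FakeDirectory();
    CountDownLatch closed = new CountDownLatch(1);
    ExecutorService fsyncService = Executors.newSingleThreadExecutor();
    try {
      Callable<Void> syncTask = () -> {
        if (!waitForSyncBeforeClose) {
          closed.await(); // force the losing interleaving: sync runs after close
        }
        dir.sync();
        return null;
      };
      Future<Void> pending = fsyncService.submit(syncTask);
      if (!waitForSyncBeforeClose) {
        dir.close(); // cleanup wins the race
        closed.countDown();
      }
      try {
        pending.get(5, TimeUnit.SECONDS);
        return null;
      } catch (ExecutionException e) {
        return e.getCause();
      } finally {
        dir.close(); // cleanup once the sync has finished
      }
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      fsyncService.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("wait for sync, then close: " + syncFailure(true));
    System.out.println("close before sync runs   : " + syncFailure(false));
  }
}
```

If cleanup waits for the pending sync before closing the directory, the sync succeeds; if the directory is closed first, the pending sync fails with the "closed" error, which matches the shape of the stack trace in the issue description.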

 



> Pull replicas throws AlreadyClosedException  
> -
>
>

[jira] [Commented] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2024-10-25 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892821#comment-17892821
 ] 

Jason Gerlowski commented on SOLR-17515:


Credit and thanks to Patrik Peng and Endika Posadas for [reporting this on the 
users list|https://lists.apache.org/thread/jhs7lkg942nxg2hlb879k6tc832yhm06]!

This seems like a pretty serious bug: perhaps worth a 9.7.1?

> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Priority: Major
>
> Several reporters on the users@ list, recently shared a bug they noticed on 
> upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR 
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
> dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
> s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
> o.a.s.c.RecoveryStrategy Error while trying to recover. 
> core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
> invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
>   at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:212)
>  ~[metrics-core-4.2.26.jar:4.2.26]
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>  ~[?:?]
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) 
> ~[?:?]
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>  ~[?:?]
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>  ~[?:?]
>   at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
> 2024-09-18 09:36:31.238 ERROR 
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8

[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-10-25 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892780#comment-17892780
 ] 

David Smiley commented on SOLR-17497:
-

bq.  execute throws exception rather than suppressing it

Maybe I'm nitpicking but execute() definitely isn't throwing the exception; 
it's impossible that it could even do so since the Runnable that does throw an 
exception happens asynchronously after execute() returns.  The change from 
before is that the thrown exception (from the Runnable) is no longer captured 
into a Future; it bubbles up to the Thread uncaughtExceptionHandler where our 
test infrastructure notices it and reports it via 
com.carrotsearch.randomizedtesting.UncaughtExceptionError.  CC [~andreybozhko].

For the case here in IndexFetcher, as long as the exception that is thrown is 
logged, I think we should suppress its propagation further.
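
To illustrate the execute() vs submit() difference described above, here is a minimal sketch using plain JDK executors. It is not Solr code: the class name, the thread name, and the IllegalStateException standing in for AlreadyClosedException are assumptions for illustration, and a hand-written uncaught-exception handler stands in for whatever the test infrastructure installs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

public class ExecuteVsSubmitDemo {

  // Hypothetical stand-in for the failing fsync task.
  private static final Runnable FAILING_TASK = () -> {
    throw new IllegalStateException("this Directory is closed");
  };

  /** submit(): the exception is stored in the Future and only surfaces via get(). */
  public static Throwable failureSeenViaSubmit() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<?> future = pool.submit(FAILING_TASK);
      future.get(5, TimeUnit.SECONDS);
      return null; // not reached for a failing task
    } catch (ExecutionException e) {
      return e.getCause();
    } catch (InterruptedException | TimeoutException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  /** execute(): returns normally; the exception later reaches the worker thread's handler. */
  public static Throwable failureSeenViaExecute() {
    AtomicReference<Throwable> uncaught = new AtomicReference<>();
    CountDownLatch handled = new CountDownLatch(1);
    ExecutorService pool = Executors.newSingleThreadExecutor(runnable -> {
      Thread thread = new Thread(runnable, "fsync-demo");
      thread.setUncaughtExceptionHandler((t, ex) -> {
        uncaught.set(ex);
        handled.countDown();
      });
      return thread;
    });
    try {
      pool.execute(FAILING_TASK); // does not throw on the calling thread
      handled.await(5, TimeUnit.SECONDS);
      return uncaught.get();
    } catch (InterruptedException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("submit  -> held by Future:       " + failureSeenViaSubmit());
    System.out.println("execute -> uncaught handler saw: " + failureSeenViaExecute());
  }
}
```

In both cases the calling thread never sees the exception directly; the only difference is where it ends up, which is why a task that is run via execute() and has no try/catch of its own shows up as an uncaught exception in the worker thread.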

> Pull replicas throws AlreadyClosedException  
> -
>
> Key: SOLR-17497
> URL: https://issues.apache.org/jira/browse/SOLR-17497
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Sanjay Dutt
>Priority: Major
> Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png
>
>
> Recently, a common exception (org.apache.lucene.store.AlreadyClosedException: 
> this Directory is closed) has been seen in multiple failed test cases. 
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
> FAILED:  
> org.apache.solr.cloud.SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull
> FAILED:  org.apache.solr.cloud.TestPullReplica.testAddDocs
>  
>  
> {code:java}
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=10271, 
> name=fsyncService-6341-thread-1, state=RUNNABLE, 
> group=TGRP-SplitShardWithNodeRoleTest]
>         at 
> __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4:E5DB3E97188A8EB9]:0)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
> closed
>         at __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4]:0)
>         at 
> app//org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:50)
>         at 
> app//org.apache.lucene.store.ByteBuffersDirectory.sync(ByteBuffersDirectory.java:237)
>         at 
> app//org.apache.lucene.tests.store.MockDirectoryWrapper.sync(MockDirectoryWrapper.java:214)
>         at 
> app//org.apache.solr.handler.IndexFetcher$DirectoryFile.sync(IndexFetcher.java:2034)
>         at 
> app//org.apache.solr.handler.IndexFetcher$FileFetcher.lambda$fetch$0(IndexFetcher.java:1803)
>         at 
> app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base@11.0.24/java.lang.Thread.run(Thread.java:829)
>  {code}
>  
> An interesting thing about these test cases is that they all share the same kind of 
> setup, where each has one shard and two replicas: one NRT and one PULL.
>  
> Going through the execution steps of one of the test cases:
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
>  
> Test flow
> 1. Create a collection with 1 NRT and 1 PULL replica
> 2. waitForState
> 3. waitForNumDocsInAllActiveReplicas(0); // *Name says it all*
> 4. Index another document.
> 5. waitForNumDocsInAllActiveReplicas(1);
> 6. Stop Pull replica
> 7. Index another document
> 8. waitForNumDocsInAllActiveReplicas(2);
> 9. Start Pull Replica
> 10. waitForState
> 11. waitForNumDocsInAllActiveReplicas(2);
>  
> As per the logs the whole sequence executed successfully. Here is the link to 
> the logs: 
> [https://ge.apache.org/s/yxydiox3gvlf2/tests/task/:solr:core:test/details/org.apache.solr.cloud.TestPullReplica/testKillPullReplica/1/output]
>  (link may stop working in the future)
>  
> The last step, which makes sure that all the active replicas have two documents 
> each, logged an info message, which is further evidence that it completed 
> successfully. 
>  
> {code:java}
> 616575 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node3 
> (https://127.0.0.1:35647/solr/pull_replica_test_kill_pull_replica_shard1_replica_n1/)
>  has all 2 docs 616606 INFO (qtp1091538342-13057-null-11348) 
> [n:127.0.0.1:38207_solr c:pull_replica_test_kill_pull_replica s:shard1 
> r:core_node4 x:pull_replica_test_kill_pull_replica_shard1_replica_p2 
> t:null-11348] o.a.s.c.S.Request webapp=/solr path=/select 
> params={q=*:*&wt=javabin&version=2} rid=null-11348 hits=

[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-10-25 Thread Sanjay Dutt (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892798#comment-17892798
 ] 

Sanjay Dutt commented on SOLR-17497:


Yeah, you are right. I have to look more into this subject (execute vs submit) 
and how this whole thing works.

+1 to this.
{quote}For the case here in IndexFetcher, as long as the exception that is 
thrown is logged, I think we should suppress its propagation further.


{quote}
I was also looking into why we are getting "User aborted replication" messages. 
For PULL replicas, RecoveryStrategy cancels the replication. Here is the 
explanation from the old JIRA.

https://issues.apache.org/jira/browse/SOLR-10233
{quote}
h3. Passive replica dies (or is unreachable)

Replica won’t be query-able. On restart, replica will recover from the leader, 
following the same flow as _realtime_ replicas: set state to DOWN, then 
RECOVERING, and finally ACTIVE. _Passive_ replicas will use a different 
{{RecoveryStrategy}} implementation, that omits *preparerecovery,* and peer 
sync attempt, it will jump to replication . If the leader didn't change, or if 
the other replicas are of type “append”, replication should be incremental. 
Once the first replication is done, passive replica will declare itself active 
and start serving traffic.
{quote}
*RecoveryStrategy.java*
{noformat}
log.info("Stopping background replicate from leader process");
zkController.stopReplicationFromLeader(coreName);
replicate(zkController.getNodeName(), core, leaderprops);{noformat}
My own theory:
 # RecoveryStrategy cancels replication.
 # FileFetcher#fetchPackets throws ReplicationHandlerException

{code:java}
if (stop) {
  stop = false;
  aborted = true;
  throw new ReplicationHandlerException("User aborted replication");
}{code}

 # FileFetcher#fetch runs its finally block, where the sync is executed asynchronously

{code:java}
fsyncService.submit(() -> {
  try {
    file.sync();
  } catch (IOException e) {
    fsyncException = e;
  } catch (InterruptedException e) {
    throw new RuntimeException(e);
  }
});{code}

 # At the same time, control returns to fetchLatestIndex, which performs 
cleanup and closes the directory

{code:java}
finally {
  if (!cleanupDone) {
    cleanup(solrCore, tmpIndexDir, indexDir, deleteTmpIdxDir, tmpTlogDir,
        successfulInstall);
  }
}{code}
So I believe there is a race condition between steps 3 and 4. I have not been 
able to reproduce it on my system yet.
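
The suspected race between steps 3 and 4 can be sketched in isolation. This is only an illustration of the theory, not IndexFetcher code and not a proposed fix: FakeDirectory, the method names, and the latch that forces the unlucky interleaving are all hypothetical, and IllegalStateException stands in for AlreadyClosedException.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class FsyncCloseRaceDemo {

  /** Hypothetical stand-in for a directory: sync() fails once close() has run. */
  static final class FakeDirectory {
    private final AtomicBoolean closed = new AtomicBoolean();

    void sync() {
      if (closed.get()) {
        throw new IllegalStateException("this Directory is closed");
      }
    }

    void close() {
      closed.set(true);
    }
  }

  /** Steps 3 and 4 unordered; the latch forces the interleaving where close() wins. */
  public static boolean syncFailsWhenCloseWinsRace() {
    FakeDirectory dir = new FakeDirectory();
    CountDownLatch closeDone = new CountDownLatch(1);
    ExecutorService fsyncService = Executors.newSingleThreadExecutor();
    try {
      Runnable delayedSync = () -> {
        try {
          closeDone.await(); // simulate the sync task being scheduled late
        } catch (InterruptedException e) {
          throw new RuntimeException(e);
        }
        dir.sync();
      };
      Future<?> pending = fsyncService.submit(delayedSync); // step 3: async sync
      dir.close();                                          // step 4: cleanup closes the directory
      closeDone.countDown();
      return failed(pending);
    } finally {
      fsyncService.shutdownNow();
    }
  }

  /** Same steps, but cleanup waits for the pending sync before closing. */
  public static boolean syncSucceedsWhenCloseWaits() {
    FakeDirectory dir = new FakeDirectory();
    ExecutorService fsyncService = Executors.newSingleThreadExecutor();
    try {
      Runnable sync = dir::sync;
      Future<?> pending = fsyncService.submit(sync);
      boolean syncFailed = failed(pending); // blocks until the sync has finished
      dir.close();
      return !syncFailed;
    } finally {
      fsyncService.shutdownNow();
    }
  }

  /** Returns true if the task behind the Future ended with an exception. */
  private static boolean failed(Future<?> future) {
    try {
      future.get(5, TimeUnit.SECONDS);
      return false;
    } catch (ExecutionException e) {
      return true;
    } catch (InterruptedException | TimeoutException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("close wins the race, sync failed: " + syncFailsWhenCloseWinsRace());
    System.out.println("close waits for sync, sync ok:    " + syncSucceedsWhenCloseWaits());
  }
}
```

In the real code the interleaving depends on thread scheduling rather than a latch, which would be consistent with the failure being intermittent and hard to reproduce locally.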

 


[jira] [Created] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2024-10-25 Thread Jason Gerlowski (Jira)
Jason Gerlowski created SOLR-17515:
--

 Summary: Recovery fails in Solr 9.7.0 if basic-auth is enabled
 Key: SOLR-17515
 URL: https://issues.apache.org/jira/browse/SOLR-17515
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 9.7
Reporter: Jason Gerlowski


Several reporters on the users@ list recently shared a bug they noticed on 
upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
NullPointerException:

{code}
2024-09-18 09:36:31.238 ERROR 
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
o.a.s.c.RecoveryStrategy Error while trying to recover. 
core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
invoke 
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
 because "this.authenticationStore" is null
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
java.lang.NullPointerException: Cannot invoke 
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
 because "this.authenticationStore" is null
at 
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
 ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
 ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 
2024-09-03 15:05:20]
at 
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 
2024-09-03 15:05:20]
at 
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:212)
 ~[metrics-core-4.2.26.jar:4.2.26]
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
 ~[?:?]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) 
~[?:?]
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
 ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
- 2024-09-03 15:05:20]
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
 ~[?:?]
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
 ~[?:?]
at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
2024-09-18 09:36:31.238 ERROR 
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2024-09-18 09:36:31.238 INFO  
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
o.a.s.c.RecoveryStrategy Wait [4] seconds before trying to recover again 
(attempt=1)
{code}

It turns out that the issue isn't specific 

[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-10-25 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17893024#comment-17893024
 ] 

David Smiley commented on SOLR-17497:
-

I'm confused; is this one JIRA issue about two different exceptions?

> Pull replicas throws AlreadyClosedException  
> -
>
> Key: SOLR-17497
> URL: https://issues.apache.org/jira/browse/SOLR-17497
> Project: Solr
>  Issue Type: Task
>Reporter: Sanjay Dutt
>Priority: Major
> Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png
>
>
> Recently, a common exception (org.apache.lucene.store.AlreadyClosedException: 
> this Directory is closed) seen in multiple failed test cases. 
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
> FAILED:  
> org.apache.solr.cloud.SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull
> FAILED:  org.apache.solr.cloud.TestPullReplica.testAddDocs
>  
>  
> {code:java}
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=10271, 
> name=fsyncService-6341-thread-1, state=RUNNABLE, 
> group=TGRP-SplitShardWithNodeRoleTest]
>         at 
> __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4:E5DB3E97188A8EB9]:0)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
> closed
>         at __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4]:0)
>         at 
> app//org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:50)
>         at 
> app//org.apache.lucene.store.ByteBuffersDirectory.sync(ByteBuffersDirectory.java:237)
>         at 
> app//org.apache.lucene.tests.store.MockDirectoryWrapper.sync(MockDirectoryWrapper.java:214)
>         at 
> app//org.apache.solr.handler.IndexFetcher$DirectoryFile.sync(IndexFetcher.java:2034)
>         at 
> app//org.apache.solr.handler.IndexFetcher$FileFetcher.lambda$fetch$0(IndexFetcher.java:1803)
>         at 
> app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base@11.0.24/java.lang.Thread.run(Thread.java:829)
>  {code}
>  
> Interesting thing about these test cases is that they all share same kind of 
> setup where each has one shard and two replicas – one NRT and another is PULL.
>  
> Going through one of the test case execution step.
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
>  
> Test flow
> 1. Create a collection with 1 NRT and 1 PULL replica
> 2. waitForState
> 3. waitForNumDocsInAllActiveReplicas(0); // *Name says it all*
> 4. Index another document.
> 5. waitForNumDocsInAllActiveReplicas(1);
> 6. Stop Pull replica
> 7. Index another document
> 8. waitForNumDocsInAllActiveReplicas(2);
> 9. Start Pull Replica
> 10. waitForState
> 11. waitForNumDocsInAllActiveReplicas(2);
>  
> As per the logs the whole sequence executed successfully. Here is the link to 
> the logs: 
> [https://ge.apache.org/s/yxydiox3gvlf2/tests/task/:solr:core:test/details/org.apache.solr.cloud.TestPullReplica/testKillPullReplica/1/output]
>  (link may stop working in the future)
>  
> Last step where they are making sure that all the active replicas should have 
> two documents each has logged a info which is another proof that it completed 
> successfully. 
>  
> {code:java}
> 616575 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node3 
> (https://127.0.0.1:35647/solr/pull_replica_test_kill_pull_replica_shard1_replica_n1/)
>  has all 2 docs 616606 INFO (qtp1091538342-13057-null-11348) 
> [n:127.0.0.1:38207_solr c:pull_replica_test_kill_pull_replica s:shard1 
> r:core_node4 x:pull_replica_test_kill_pull_replica_shard1_replica_p2 
> t:null-11348] o.a.s.c.S.Request webapp=/solr path=/select 
> params={q=*:*&wt=javabin&version=2} rid=null-11348 hits=2 status=0 QTime=0 
> 616607 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node4 
> (https://127.0.0.1:38207/solr/pull_replica_test_kill_pull_replica_shard1_replica_p2/)
>  has all 2 docs{code}
>  
> *Where is the issue then?*
> In the logs it has been observed, that after restarting the PULL replica. The 
> recovery process started and after fetching all the files info from the NRT, 
> the replication aborted and logged "User aborted replication"
>  
> {code:java}
> o.a.s.h.IndexFetcher User aborted Replication => 
> org.apache.solr.handler.IndexFetcher$ReplicationHandlerException: User 
> aborted replication at 
> org.apache.so

[PR] Fix SolrJmxReporterTest#testClosedCore [solr]

2024-10-25 Thread via GitHub


iamsanjay opened a new pull request, #2797:
URL: https://github.com/apache/solr/pull/2797

   This PR addresses a race condition in the code where a separate thread 
continuously retrieves attributes from an MBean, while the main thread may 
unload the MBean before the retrieval thread has fully terminated.
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended, not available for 
branches on forks living under an organisation)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   





Re: [PR] SOLR-16962: Restore ability to configure tlog directory [solr]

2024-10-25 Thread via GitHub


iamsanjay commented on PR #1895:
URL: https://github.com/apache/solr/pull/1895#issuecomment-2439282497

   git bisect points to this PR. It needs a bit more attention to determine 
whether this PR is causing it or not.
   
   **org.apache.solr.search.TestCollapseQParserPlugin.testMultiSort 
(:solr:core)**
 ```
 Test history: 
https://ge.apache.org/scans/tests?search.rootProjectNames=solr-root&tests.container=org.apache.solr.search.TestCollapseQParserPlugin&tests.test=testMultiSort
 
http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.search.TestCollapseQParserPlugin.testMultiSort
   Test output: 
/Users/sanjaydutt/Documents/solr/solr/core/build/test-results/test/outputs/OUTPUT-org.apache.solr.search.TestCollapseQParserPlugin.txt
   Reproduce with: ./gradlew :solr:core:test --tests 
"org.apache.solr.search.TestCollapseQParserPlugin.testMultiSort" -Ptests.jvms=4 
-Ptests.haltonfailure=false "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 
-XX:+UseParallelGC -XX:ActiveProcessorCount=1 -XX:ReservedCodeCacheSize=120m" 
-Ptests.seed=F8EF1414D2733583 -Ptests.multiplier=2 -Ptests.badapples=false 
-Ptests.file.encoding=US-ASCII
   ```
   





Re: [PR] SOLR-16116: Use apache curator to manage the Solr Zookeeper interactions [solr]

2024-10-25 Thread via GitHub


HoustonPutman commented on code in PR #760:
URL: https://github.com/apache/solr/pull/760#discussion_r1817369402


##
solr/test-framework/build.gradle:
##
@@ -43,6 +43,17 @@ dependencies {
   var zkExcludes = {
 exclude group: "org.apache.yetus", module: "audience-annotations"
   }
+  api('org.apache.curator:curator-client', {

Review Comment:
   I've changed all curator deps here to "implementation"



##
solr/solrj-zookeeper/build.gradle:
##
@@ -32,6 +32,13 @@ dependencies {
 
 implementation project(':solr:solrj')
 
+api('org.apache.curator:curator-client', {

Review Comment:
   Actually implementation should be ok for the `curator-client`. For 
`curator-framework`, that wouldn't be great because of the 
`SolrZkClient.multi()` function parameters.



##
gradle/testing/randomization/policies/solr-tests.policy:
##
@@ -50,6 +50,7 @@ grant {
   permission java.net.SocketPermission "127.0.0.1:4", "connect,resolve";
   permission java.net.SocketPermission "127.0.0.1:6", "connect,resolve";
   permission java.net.SocketPermission "127.0.0.1:8", "connect,resolve";
+  permission java.net.SocketPermission "--", "connect,resolve";

Review Comment:
   It's a fake ZK host used in a test, just like all of the other ones, but I 
added a comment.






Re: [PR] SOLR-16470: Create V2 equivalent of V1 Replication: Get files/{filePath} [solr]

2024-10-25 Thread via GitHub


gerlowskija commented on code in PR #2734:
URL: https://github.com/apache/solr/pull/2734#discussion_r1817490345


##
solr/core/src/java/org/apache/solr/handler/admin/api/CoreReplicationAPI.java:
##
@@ -68,6 +71,45 @@ public FileListResponse fetchFileList(
 return doFetchFileList(gen);
   }
 
+  @GET

Review Comment:
   OK, I'll leave you to it, but if you have any questions or get stuck, lmk 
and I'll try to help out!
   
   (Hope you had a great vacation!)






[jira] [Commented] (SOLR-17497) Pull replicas throws AlreadyClosedException

2024-10-25 Thread Sanjay Dutt (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17893025#comment-17893025
 ] 

Sanjay Dutt commented on SOLR-17497:


Sorry, initially I had no idea what was going on, so I shared whatever I could 
find here in this JIRA. Only one exception is relevant: 
AlreadyClosedException. The other one, "User aborted Replication", is expected 
and is observed whenever replication is aborted. Even when you run 
org.apache.solr.cloud.TestPullReplica.testKillPullReplica, you will see this 
exception in the logs, and that's fine IMO. 
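
For readers less familiar with this failure mode, here is a minimal Java sketch 
of how a background sync can hit an already-closed directory, matching the shape 
of the stack trace in the quoted description (an fsyncService thread calling 
sync() after close). FakeDirectory, SyncAfterCloseSketch and the latch-forced 
ordering are hypothetical stand-ins for illustration only, not Solr or Lucene 
code, and IllegalStateException stands in for AlreadyClosedException.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration only; not taken from Solr or Lucene. */
final class SyncAfterCloseSketch {

  /** Stand-in for a directory that rejects any use after close(). */
  static final class FakeDirectory {
    private volatile boolean open = true;

    void close() {
      open = false;
    }

    void sync() {
      if (!open) {
        throw new IllegalStateException("this Directory is closed");
      }
    }
  }

  /** Returns the failure message from the background sync, or null if it succeeded. */
  public static String run() {
    FakeDirectory dir = new FakeDirectory();
    CountDownLatch closed = new CountDownLatch(1);
    ExecutorService fsyncService = Executors.newSingleThreadExecutor();
    try {
      // The sync task is queued while the directory is still open...
      Future<String> pending = fsyncService.submit(() -> {
        closed.await(); // forces the "close happens first" ordering for the demo
        dir.sync();
        return "synced";
      });
      // ...but the owner closes the directory before the task gets to run.
      dir.close();
      closed.countDown();
      pending.get();
      return null;
    } catch (ExecutionException e) {
      return e.getCause().getMessage();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } finally {
      fsyncService.shutdown();
    }
  }
}
```

In this sketch the failure surfaces on the executor thread rather than on the 
thread that closed the directory, which is consistent with the trace being 
reported as an uncaught exception in a separate thread.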

 

> Pull replicas throws AlreadyClosedException  
> -
>
> Key: SOLR-17497
> URL: https://issues.apache.org/jira/browse/SOLR-17497
> Project: Solr
>  Issue Type: Task
>Reporter: Sanjay Dutt
>Priority: Major
> Attachments: Screenshot 2024-10-23 at 6.01.02 PM.png
>
>
> Recently, a common exception (org.apache.lucene.store.AlreadyClosedException: 
> this Directory is closed) has been seen in multiple failed test cases. 
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
> FAILED:  
> org.apache.solr.cloud.SplitShardWithNodeRoleTest.testSolrClusterWithNodeRoleWithPull
> FAILED:  org.apache.solr.cloud.TestPullReplica.testAddDocs
>  
>  
> {code:java}
> com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an 
> uncaught exception in thread: Thread[id=10271, 
> name=fsyncService-6341-thread-1, state=RUNNABLE, 
> group=TGRP-SplitShardWithNodeRoleTest]
>         at 
> __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4:E5DB3E97188A8EB9]:0)
> Caused by: org.apache.lucene.store.AlreadyClosedException: this Directory is 
> closed
>         at __randomizedtesting.SeedInfo.seed([3F7DACB3BC44C3C4]:0)
>         at 
> app//org.apache.lucene.store.BaseDirectory.ensureOpen(BaseDirectory.java:50)
>         at 
> app//org.apache.lucene.store.ByteBuffersDirectory.sync(ByteBuffersDirectory.java:237)
>         at 
> app//org.apache.lucene.tests.store.MockDirectoryWrapper.sync(MockDirectoryWrapper.java:214)
>         at 
> app//org.apache.solr.handler.IndexFetcher$DirectoryFile.sync(IndexFetcher.java:2034)
>         at 
> app//org.apache.solr.handler.IndexFetcher$FileFetcher.lambda$fetch$0(IndexFetcher.java:1803)
>         at 
> app//org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at 
> java.base@11.0.24/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base@11.0.24/java.lang.Thread.run(Thread.java:829)
>  {code}
>  
> The interesting thing about these test cases is that they all share the same 
> kind of setup: each has one shard and two replicas, one NRT and one PULL.
>  
> Walking through the execution steps of one of the test cases:
> FAILED:  org.apache.solr.cloud.TestPullReplica.testKillPullReplica
>  
> Test flow
> 1. Create a collection with 1 NRT and 1 PULL replica
> 2. waitForState
> 3. waitForNumDocsInAllActiveReplicas(0); // *Name says it all*
> 4. Index another document.
> 5. waitForNumDocsInAllActiveReplicas(1);
> 6. Stop Pull replica
> 7. Index another document
> 8. waitForNumDocsInAllActiveReplicas(2);
> 9. Start Pull Replica
> 10. waitForState
> 11. waitForNumDocsInAllActiveReplicas(2);
>  
> As per the logs, the whole sequence executed successfully. Here is the link to 
> the logs: 
> [https://ge.apache.org/s/yxydiox3gvlf2/tests/task/:solr:core:test/details/org.apache.solr.cloud.TestPullReplica/testKillPullReplica/1/output]
>  (link may stop working in the future)
>  
> The last step, which verifies that every active replica has two documents, 
> logged an INFO message for each replica, which is further proof that it 
> completed successfully. 
>  
> {code:java}
> 616575 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node3 
> (https://127.0.0.1:35647/solr/pull_replica_test_kill_pull_replica_shard1_replica_n1/)
>  has all 2 docs
> 616606 INFO (qtp1091538342-13057-null-11348) 
> [n:127.0.0.1:38207_solr c:pull_replica_test_kill_pull_replica s:shard1 
> r:core_node4 x:pull_replica_test_kill_pull_replica_shard1_replica_p2 
> t:null-11348] o.a.s.c.S.Request webapp=/solr path=/select 
> params={q=*:*&wt=javabin&version=2} rid=null-11348 hits=2 status=0 QTime=0 
> 616607 INFO 
> (TEST-TestPullReplica.testKillPullReplica-seed#[F30CC837FDD0DC28]) [n: c: s: 
> r: x: t:] o.a.s.c.TestPullReplica Replica core_node4 
> (https://127.0.0.1:38207/solr/pull_replica_test_kill_pull_replica_shard1_replica_p2/)
>  has all 2 docs
> {code}
>  
> *Where is the issue then?*
> In the logs it has been observed, that after restarting th

[jira] [Updated] (SOLR-16962) updateLog tlog dir location config is silently ignored

2024-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-16962:
--
Labels: pull-request-available  (was: )

> updateLog tlog dir location config is silently ignored 
> ---
>
> Key: SOLR-16962
> URL: https://issues.apache.org/jira/browse/SOLR-16962
> Project: Solr
>  Issue Type: Bug
>Affects Versions: main (10.0), 9.2.1
>Reporter: Michael Gibney
>Assignee: Michael Gibney
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 9.7
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> If you follow the 
> [instructions|https://solr.apache.org/guide/solr/latest/configuration-guide/commits-transaction-logs.html#transaction-log]
>  on configuring a non-default tlog location, Solr currently silently ignores 
> explicit configuration and uses the default location 
> {{[instanceDir]/data/tlog/}}.
> Afaict this has been the case for some time, with several layers of faithful 
> refactorings now somewhat obscuring the initial intent.
> This issue proposes to restore the initial intent, and also shore up some of 
> the nuances of handling this (now that the config actually has an effect):
> # resolve relative "dir" spec relative to core instanceDir
> # disallow relative "dir" spec that escapes core instanceDir (e.g., 
> {{dir=../../some_path}})
> # for absolute "dir" spec outside of the core instanceDir, scope the tlog dir 
> by core name
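
The three rules proposed above can be sketched in Java roughly as follows. 
This is only an illustration under stated assumptions, not the code from the 
pull request: TlogDirSketch and resolveTlogDir are hypothetical names, and 
"scope by core name" is assumed here to mean appending the core name as a 
subdirectory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not Solr's actual implementation. */
final class TlogDirSketch {
  private TlogDirSketch() {}

  public static Path resolveTlogDir(Path instanceDir, String coreName, String dirSpec) {
    Path base = instanceDir.toAbsolutePath().normalize();
    Path spec = Paths.get(dirSpec);

    if (!spec.isAbsolute()) {
      // Rule 1: a relative spec is resolved against the core instanceDir.
      Path resolved = base.resolve(spec).normalize();
      // Rule 2: a relative spec must not escape the core instanceDir.
      if (!resolved.startsWith(base)) {
        throw new IllegalArgumentException(
            "relative tlog dir escapes instanceDir: " + dirSpec);
      }
      return resolved;
    }

    // Rule 3 (assumed interpretation): an absolute spec outside instanceDir
    // gets a per-core subdirectory so that cores do not share one tlog dir.
    Path absolute = spec.normalize();
    return absolute.startsWith(base) ? absolute : absolute.resolve(coreName);
  }
}
```

With a POSIX-style instanceDir of /var/solr/core1, a spec of logs/tlog would 
stay inside the core, ../../some_path would be rejected, and /mnt/tlogs would 
become /mnt/tlogs/core1 under the assumption stated above.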



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-17511) CLI: Resole -i conflicts (async-id, cluster-id)

2024-10-25 Thread Christos Malliaridis (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892754#comment-17892754
 ] 

Christos Malliaridis commented on SOLR-17511:
-

That is right, thanks for the correction. 😅

> CLI: Resole -i conflicts (async-id, cluster-id)
> ---
>
> Key: SOLR-17511
> URL: https://issues.apache.org/jira/browse/SOLR-17511
> Project: Solr
>  Issue Type: Sub-task
>  Components: cli
>Affects Versions: 9.7, 9.6.1
>Reporter: Christos Malliaridis
>Priority: Minor
>  Labels: cli
>
> The CLI flag {{\-i}} is currently used in two options:
> - for {{async-id}} in SnapshotExportTool for specifying an asynchronous 
> request identifier
> - for {{cluster-id}} in SolrExporter for specifying a unique cluster 
> identifier
> Since both short options are not obvious and the letter {{i}} may be used in 
> another context in the future, we should reserve it and deprecate (9.8) / 
> remove (10.0) it from the above options.






[jira] [Updated] (SOLR-17511) CLI: Resole -i conflicts (async-id, cluster-id)

2024-10-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-17511:
--
Labels: cli pull-request-available  (was: cli)

> CLI: Resole -i conflicts (async-id, cluster-id)
> ---
>
> Key: SOLR-17511
> URL: https://issues.apache.org/jira/browse/SOLR-17511
> Project: Solr
>  Issue Type: Sub-task
>  Components: cli
>Affects Versions: 9.7, 9.6.1
>Reporter: Christos Malliaridis
>Priority: Minor
>  Labels: cli, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (SOLR-17488) CLI: Resolve -d conflicts

2024-10-25 Thread Eric Pugh (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh resolved SOLR-17488.
--
Fix Version/s: 9.8
   Resolution: Fixed

> CLI: Resolve -d conflicts
> -
>
> Key: SOLR-17488
> URL: https://issues.apache.org/jira/browse/SOLR-17488
> Project: Solr
>  Issue Type: Sub-task
>  Components: cli
>Affects Versions: 9.7, 9.6.1
>Reporter: Christos Malliaridis
>Assignee: Eric Pugh
>Priority: Major
>  Labels: cli, pull-request-available
> Fix For: 9.8
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The CLI flag {{\-d}} is currently used in four options:
> - {{conf-dir}} for providing the configuration directory in CreateTool, 
> ConfigSetDownloadTool, ConfigSetUploadTool, ZKCLI
> - {{delete-config}} (with argument) for deleting configurations together with 
> collections in DeleteTool, defaults to {{true}}
> - {{server-dir}} for defining the Solr root / server directory in 
> RunExampleTool
> - {{delay}} for delaying recursive posts in PostTool
> *Proposed Resolution*
> To avoid confusion over the {{\-d}} flag, the following changes are 
> proposed:
> - Keep {{\-d}} for {{conf-dir}} in CreateTool, ConfigSetDownloadTool, 
> ConfigSetUploadTool, ZKCLI
> - Deprecate (9.8) and remove (10.0) the {{delete-config}} option, 
> replacing it with {{keep}} ({{\-\-keep}} without arguments) to simplify 
> and improve the user experience and to avoid the {{\-d}} conflict. 
> "{{\-\-keep}}" should behave equivalently to "{{\-\-delete-config false}}".
> - Deprecate (9.8) and remove (10.0) {{\-d}} for {{server-dir}} in 
> RunExampleTool. Note that {{\-\-server-dir}} may be removed or renamed to use 
> better wording in the future.
> - Support {{\-\-server-dir}} in {{bin/solr}} and if necessary 
> {{bin/solr.cmd}} in version 9.8 and 10.0
> - Deprecate (9.8) and remove (10.0) {{\-d}} for {{delay}} in PostTool to 
> avoid any confusion with {{conf-dir}}.
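
The intended equivalence between {{--keep}} and {{--delete-config false}} can 
be sketched in Java as below. This is a hand-rolled illustration only: 
DeleteOptionsSketch and shouldDeleteConfig are hypothetical names, and the 
real DeleteTool parses its options differently.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the proposed flag semantics; not DeleteTool code. */
final class DeleteOptionsSketch {
  private DeleteOptionsSketch() {}

  /** Returns true if the configuration should be deleted with the collection. */
  public static boolean shouldDeleteConfig(String... args) {
    List<String> argList = Arrays.asList(args);

    // Proposed replacement: a bare flag with no argument.
    if (argList.contains("--keep")) {
      return false;
    }

    // Deprecated form: the option takes an explicit true/false argument.
    int idx = argList.indexOf("--delete-config");
    if (idx >= 0 && idx + 1 < argList.size()) {
      return Boolean.parseBoolean(argList.get(idx + 1));
    }

    // Per the issue text, delete-config defaults to true.
    return true;
  }
}
```

In this sketch, passing "--keep" and passing "--delete-config", "false" give 
the same result, and passing neither keeps the default of deleting the 
configuration.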






[jira] [Updated] (SOLR-17511) CLI: Resolve -i conflicts (async-id, cluster-id)

2024-10-25 Thread Eric Pugh (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh updated SOLR-17511:
-
Summary: CLI: Resolve -i conflicts (async-id, cluster-id)  (was: CLI: 
Resole -i conflicts (async-id, cluster-id))

> CLI: Resolve -i conflicts (async-id, cluster-id)
> 
>
> Key: SOLR-17511
> URL: https://issues.apache.org/jira/browse/SOLR-17511
> Project: Solr
>  Issue Type: Sub-task
>  Components: cli
>Affects Versions: 9.7, 9.6.1
>Reporter: Christos Malliaridis
>Assignee: Eric Pugh
>Priority: Minor
>  Labels: cli, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (SOLR-17511) CLI: Resole -i conflicts (async-id, cluster-id)

2024-10-25 Thread Eric Pugh (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Pugh reassigned SOLR-17511:


Assignee: Eric Pugh

> CLI: Resole -i conflicts (async-id, cluster-id)
> ---
>
> Key: SOLR-17511
> URL: https://issues.apache.org/jira/browse/SOLR-17511
> Project: Solr
>  Issue Type: Sub-task
>  Components: cli
>Affects Versions: 9.7, 9.6.1
>Reporter: Christos Malliaridis
>Assignee: Eric Pugh
>Priority: Minor
>  Labels: cli, pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Comment Edited] (SOLR-6122) API to cancel an already submitted/running Collections API call

2024-10-25 Thread Yuntong Qu (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892240#comment-17892240
 ] 

Yuntong Qu edited comment on SOLR-6122 at 10/25/24 2:54 PM:


I made a POC PR that uses deleteStatus to remove not-started tasks: the delete 
status endpoint forcefully deletes the tracking for a not-started task, and 
after the cancel the task will not be present in the failure/completed maps. 
TBH, I don't particularly love this solution, as it limits our ability to 
cancel in-progress tasks going forward. But I also want to get some opinions 
on this. 

–
One of the main problems I am having right now is dealing with 
OverseerTaskProcessor keeping the following in-memory data structures:  
 - runningZKTasks (set of tasks that have been picked up for processing but not 
cleaned up from the ZK work-queue)

 - blockedTasks (contains tasks that were read from the work queue but could 
not be executed because they are blocked or the execution queue is full)

With these two data structures, the Overseer does not have a real-time view of 
what's happening on the ZK queue (this is an optimization to reduce ZK reads). 

I am working on another approach: add the cancel task to 
_*collection-queue-work*_, add a new _OverseerMessageHandler_ that handles 
cancel tasks specifically (instead of using OverseerCollectionMessageHandler), 
and let that cancel message handler modify the ZK queue and the Overseer's 
in-memory tracking.

–
Re [~gerlowskija] on the ordering of cancels:
 - If we send the cancel task to _*collection-queue-work*_, there is still a 
chance that the cancel won't be picked up, since OverseerTaskProcessor limits 
the number of tasks picked up from the queue, and if we exceed 
MAX_BLOCKED_TASKS, no new tasks will be picked up. And if the number of 
running tasks reaches or exceeds MAX_PARALLEL_TASKS, no new cancel task can be 
started. 

 - After a cancel task is picked up in OverseerTaskProcessor, from my reading 
of the code, each queue item spins up another Runner thread to handle the 
task, so the processing of queued items should be quite fast since it's 
non-blocking.
 - Also, locking is at different levels (replica/shard/collection), so if we 
make the cancel task require no lock, the cancel can be executed earlier.
 - As long as we make sure that the cancel task has a real-time view of the ZK 
queue while it is executing, it should not mistake a started task for a 
pending task.  

 - To completely eliminate the concern of a cancel task not being handled ASAP 
when submitted, the best approach in my mind is to have another queue that 
takes in cancel task requests. 

The trade-off here is complexity, but submitting to _*collection-queue-work*_ 
should mostly work. Maybe a later improvement would be to add a new queue if 
needed.
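
To make the separate-queue idea concrete, here is a small in-memory Java 
sketch. It is only a model under stated assumptions: CancelQueueSketch and its 
methods are hypothetical, the queues are plain Java collections rather than 
ZooKeeper queues, and it only cancels tasks that have not started, as 
discussed above.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Hypothetical in-memory model; not the Overseer's actual code. */
final class CancelQueueSketch {
  private final Queue<String> workQueue = new ArrayDeque<>();
  private final Queue<String> cancelQueue = new ArrayDeque<>();
  private final Set<String> running = new HashSet<>();

  public void submit(String taskId) {
    workQueue.add(taskId);
  }

  /** Cancel requests go to their own queue, so queued work cannot delay them. */
  public void requestCancel(String taskId) {
    cancelQueue.add(taskId);
  }

  /** One processing pass: apply all pending cancels, then start the next task. */
  public String startNext() {
    String cancelId;
    while ((cancelId = cancelQueue.poll()) != null) {
      // Only not-started tasks are removed; a running task is left alone.
      if (!running.contains(cancelId)) {
        workQueue.remove(cancelId);
      }
    }
    String next = workQueue.poll();
    if (next != null) {
      running.add(next);
    }
    return next; // null when nothing is left to start
  }

  public boolean isRunning(String taskId) {
    return running.contains(taskId);
  }
}
```

Because the cancel queue is drained before any new work is started, a cancel 
submitted behind a long backlog still takes effect on the next pass in this 
model, which is the property the dedicated queue is meant to provide.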

 


was (Author: yuntong):
Made a POC PR using deleteStatus to remove not-started tasks. Using delete 
status endpoint to forcefully delete not-started tracking. And after cancel, 
the task will not be present in failure/completed map. 
TBH, I don't particular love this solution as it limit us forward to do cancel 
in-progress. But also I want to get some opinions on this. 

–
One of the main problem I am having rn is to deal with OverseerTaskProcesser 
keeing below in-memory data structure  
 - runningZKTasks (Set of tasks that have been picked up for processing but not 
cleaned up from zk work-queue)

 - blockedTasks (contain tasks which are read from work queue but could not be 
executed because they are blocked or the execution queue is full)

With above 2 data structure, overseer will not have real time view of what's 
happening on ZK queue ( which is an optimization to reduce ZK read ). 

I am working on another way to add cancel task to _*collection-queue-work*_ and 
a new _OverseerMessageHandler_ to handle cancel task specific (instead of using 
OverseerCollectionMessageHandler), and let that cancel message handler modify 
ZK queue and in-memory tracking for Overseer

–
Re [~gerlowskija] on order of cancel:
 - If we send cancel task to _*collection-queue-work,*_ there are still chances 
that the cancel won't be picked up, since in OverseerTaskProcessor we limit num 
of task picked up from the queue, and if we exceed MAX_BLOCKED_TASKS, no new 
tasks will be picked up. And if there many running task exceeding or 
MAX_PARALLEL_TASKS, no new cancel tasks can be started. 

 - after a cancel task is being picked up in OverseerTaskProcessor, from my 
reading of the coding, each queue item will spun up another Runner thread to 
handle each task, so the processing of queued item should be quite fast since 
it's non-blocking.
 - Also locking is on different level (replica/shard/collection), thus if we 
make cancel task require no lock, cancel can be excuated earlier
 - As long as we make sure that when cancel task is executing. it has real time 
view of ZK queue, it should not mistaken a started task as pending task  

 - To completely elimiate the concern of cancle task not beeing handle ASAP 
wh

[jira] [Assigned] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2024-10-25 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski reassigned SOLR-17515:
--

Assignee: Jason Gerlowski

> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
>
> Several reporters on the users@ list recently shared a bug they noticed on 
> upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR 
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
> dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
> s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
> o.a.s.c.RecoveryStrategy Error while trying to recover. 
> core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
> invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
> ...
> {code}
> It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
> cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
> during replica recovery*.  The result is that replicas will fail to recover, 
> and sit marked as "recovering" indefinitely.
> The issue can be reproduced locally in a source-checkout using the following 
> steps:
> {code}
> git checkout branch_9_7
> ./gradlew clean assemble
> cd solr/packaging/build/solr-9.7.0-SNAPSHOT
> # At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, 
> "_default" configset
> bin/solr start -e cloud
> bin/solr post -c gettingstarted example/exampledocs/books.json
> # Stop the node containing the non-leader replica
> bin/solr stop -p 
> bin/solr post -c gettingstarted example/exampledocs/books.csv
> # Enable auth and trigger recovery by turning the node back on
> bin/solr auth enable -type basicAuth -credentials solr:solrRocks 
> -blockUnknown true
> # This line will need to be tweaked based on which Solr node was previously stopped
> "bin/solr" start --cloud -p  -s "example/cloud//solr" -z 
> 127.0.0.1:9983
> {code}




[jira] [Commented] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2024-10-25 Thread Sanjay Dutt (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892962#comment-17892962
 ] 

Sanjay Dutt commented on SOLR-17515:


Thank you so much [~gerlowskija] for reproducing it and providing all the 
details. I am a bit confused, though, by all the different auth mechanisms we 
have in place. Even last time, two auth cases were found, for which new test 
cases were added. Clearly, more test cases are required. I am going to work on 
this one unless you are already on it. 

 

> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Priority: Major
>

[jira] [Commented] (SOLR-17515) Recovery fails in Solr 9.7.0 if basic-auth is enabled

2024-10-25 Thread Sanjay Dutt (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892967#comment-17892967
 ] 

Sanjay Dutt commented on SOLR-17515:


We both updated it at the same time. That's great! Yes, go ahead and take it; 
meanwhile I will try to see why my old test cases were not able to catch 
this one, and try to update them. 

> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 9.7
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
>
> Several reporters on the users@ list, recently shared a bug they noticed on 
> upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR 
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr 
> dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts 
> s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] 
> o.a.s.c.RecoveryStrategy Error while trying to recover. 
> core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot 
> invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
>  because "this.authenticationStore" is null
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
>  ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
>  ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - 
> anshum - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) 
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum 
> - 2024-09-03 15:05:20]
> ...
> {code}
> It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
> cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
> during replica recovery*.  The result is that replicas will fail to recover, 
> and sit marked as "recovering" indefinitely.
> The issue can be reproduced locally in a source-checkout using the following 
> steps:
> {code}
> git checkout branch_9_7
> ./gradlew clean assemble
> cd solr/packaging/build/solr-9.7.0-SNAPSHOT
> # At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, 
> "_default" configset
> bin/solr start -e cloud
> bin/solr post -c gettingstarted example/exampledocs/books.json
> # Stop the node containing the non-leader replica
> bin/solr stop -p 
> bin/solr post -c gettingstarted example/exampledocs/books.csv
> # Enable auth and trigger recovery by turning the node back on
> bin/solr auth enable -type basicAuth -credentials solr:solrRocks 
> -blockUnknown true
> # This line will need tweaked based on which Solr node was previo

[PR] Fix release wizard to remove from Solr space before attempting Lucene [solr]

2024-10-25 Thread via GitHub


anshumg opened a new pull request, #2796:
URL: https://github.com/apache/solr/pull/2796

   We should simply stop attempting to clean up the Lucene space once we do the 
Solr 10.0 release. Right now the wizard fails when it tries to clean up the 
Lucene space because the 9x Solr releases are not found there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



Re: [PR] EOL Solr 8 [solr-site]

2024-10-25 Thread via GitHub


HoustonPutman commented on PR #131:
URL: https://github.com/apache/solr-site/pull/131#issuecomment-2438874511

   Ok, revised the two sentences. Happy to change it to whatever if you still 
don't like it.





Re: [PR] EOL Solr 8 [solr-site]

2024-10-25 Thread via GitHub


anshumg commented on code in PR #131:
URL: https://github.com/apache/solr-site/pull/131#discussion_r1817178495


##
content/solr/solr_news/2024-10-25-solr8-eol.md:
##
@@ -0,0 +1,8 @@
+Title: Solr 8 reaches End-Of-Life
+category: solr/news
+save_as:
+
+After the release of Solr 8.11.4, the Apache Solr community will no longer provide support for Solr 8.11.
+With Lucene 10 having been released, and therefore Lucene 8.11 reaching EOL, the Apache Lucene and Solr community are no longer able to provide new releases for Solr 8.

Review Comment:
   This has a lot of overlap with the previous sentence, right?



##
content/solr/solr_news/2024-10-25-solr8-eol.md:
##
@@ -0,0 +1,8 @@
+Title: Solr 8 reaches End-Of-Life
+category: solr/news
+save_as:
+
+After the release of Solr 8.11.4, the Apache Solr community will no longer provide support for Solr 8.11.

Review Comment:
   Let's change that to say 8x instead of 8.11.
   
   






Re: [PR] SOLR-16390: v2 Cluster Property APIs. [solr]

2024-10-25 Thread via GitHub


gerlowskija commented on PR #2788:
URL: https://github.com/apache/solr/pull/2788#issuecomment-2438568772

   Still going through the individual files on this PR, but wanted to respond 
to some of the high-level comments first:
   
   > Wasn't sure how to model the response. Different APIs use different error 
for "does not exist" responses: /api/collections/collectionName returns a 400, 
/api/aliases/specificalias returns a 405, 
/solr/collectionName/schema/fields/fieldName returns a 404
   
   I'm personally a fan of 404 in this case as it seems a little more 
actionable for users than the more generic '400', so that'd be my preference.  
But I don't have any strong feelings on that point, and would be open to 
something else if you do have preferences?  When we decide, we should document 
the decision in `dev-docs/v2-api-conventions.adoc` so there's a "standard" we 
can align on.
   
   (I suspect the 405 returned by `GET /api/aliases/nonexistentAlias` is a bug, 
FWIW.  Will have to file a ticket for that if I can reproduce...)
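   
   To make the 404 option concrete, here is a minimal sketch in plain Java. It 
is not Solr code: `ClusterPropLookup` and `Result` are hypothetical names, and 
the status is modeled as a bare int rather than a real HTTP response.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;

   /** Hypothetical illustration only; this is not how Solr implements the API. */
   final class ClusterPropLookup {

     /** Minimal stand-in for an HTTP response: a status code plus a value. */
     static final class Result {
       final int status;
       final Object value; // null when the property does not exist

       Result(int status, Object value) {
         this.status = status;
         this.value = value;
       }
     }

     private final Map<String, Object> props;

     ClusterPropLookup(Map<String, Object> props) {
       this.props = new HashMap<>(props); // defensive copy
     }

     /** Returns 200 with the value when the property exists, 404 otherwise. */
     Result get(String name) {
       Object value = props.get(name);
       return value == null ? new Result(404, null) : new Result(200, value);
     }
   }
   ```
   
   With this shape a caller can tell "property does not exist" (404) apart from 
a malformed request (400) without parsing the error message.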
   
   > I wasn't sure if that would be preferred over grouping them all together 
as was done with AliasPropertyApis/AliasProperty
   
   I prefer grouping related APIs into a single file, at least on the 'api' 
side.  IMO it cuts down on boilerplate, and makes reviewing and browsing easier 
by keeping a bunch of related definitions together.  But again, it's a very 
slight preference on my end if you happen to prefer the alternative.
   
   > The new v2 JAX-RS Bulk Update ClusterProp API requires providing a body 
that looks like {"properties":{"actualPropertyToBeUpdated":...}} because I 
didn't know how to map an unknown top-level value
   
   Hmm - I think you should be able to nuke 
`SetNestedClusterPropertyRequestBody` altogether, and replace it in the method 
signature with `Map`? e.g.
   
   ```
   @PUT
   @Operation(
       summary = "Set nested cluster properties in this Solr cluster",
       tags = {"cluster-properties"})
   SolrJerseyResponse createOrUpdateNestedClusterProperty(
       @RequestBody(description = "Property/ies to be set", required = true)
           Map propertyValuesByName)
   ```
   
   Or does that break something or other that I've forgotten about?
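   
   For illustration, the difference between the two body shapes can be sketched 
in plain Java, with no Jersey or Solr classes involved. `ClusterPropBodyShapes` 
and both method names are hypothetical, and the maps stand in for 
already-parsed JSON.
   
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   /** Hypothetical illustration of the wrapped vs. flat request-body shapes. */
   final class ClusterPropBodyShapes {

     /** Wrapped shape: {"properties": {"name": value, ...}}. */
     static Map<String, Object> fromWrappedBody(Map<String, Object> body) {
       Object nested = body.get("properties");
       if (!(nested instanceof Map)) {
         throw new IllegalArgumentException("expected a 'properties' object");
       }
       Map<String, Object> out = new LinkedHashMap<>();
       for (Map.Entry<?, ?> e : ((Map<?, ?>) nested).entrySet()) {
         out.put(String.valueOf(e.getKey()), e.getValue());
       }
       return out;
     }

     /** Flat shape: {"name": value, ...}; top-level keys are the property names. */
     static Map<String, Object> fromFlatBody(Map<String, Object> body) {
       return new LinkedHashMap<>(body);
     }
   }
   ```
   
   Both methods yield the same name-to-value map; the flat variant just spares 
the client the extra "properties" envelope.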





[PR] EOL Solr 8 [solr-site]

2024-10-25 Thread via GitHub


HoustonPutman opened a new pull request, #131:
URL: https://github.com/apache/solr-site/pull/131

   Made a small news page, and changed the downloads to state 8.11 is EOL





[jira] [Created] (SOLR-17516) LBHttpSolrClient: support HttpJdkSolrClient

2024-10-25 Thread James Dyer (Jira)
James Dyer created SOLR-17516:
-

 Summary: LBHttpSolrClient: support HttpJdkSolrClient
 Key: SOLR-17516
 URL: https://issues.apache.org/jira/browse/SOLR-17516
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Reporter: James Dyer


With SOLR-599 we added a new SolrJ client *HttpJdkSolrClient* which uses 
java.net.http.HttpClient internally.  We can also support load balancing.  This 
ticket is to factor out common functionality from the existing 
*LBHttp2SolrClient*, creating a new sibling class *LBHttpJdkSolrClient*.

This is a prerequisite for having a version of *CloudSolrClient* that works 
with *HttpJdkSolrClient*.
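
As a rough sketch of the kind of refactoring described above (not the actual 
SolrJ design): the endpoint rotation and failover could live in a shared base 
class, with each sibling supplying only its transport. The class and method 
names below are hypothetical, and plain strings stand in for requests and 
responses.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical base class: round-robin endpoint choice with simple failover. */
abstract class RoundRobinLbSketch {
  private final List<String> baseUrls;
  private final AtomicInteger next = new AtomicInteger();

  RoundRobinLbSketch(List<String> baseUrls) {
    if (baseUrls.isEmpty()) {
      throw new IllegalArgumentException("need at least one base URL");
    }
    this.baseUrls = new ArrayList<>(baseUrls); // defensive copy
  }

  /** Transport-specific part; a subclass would delegate to its own HTTP client. */
  protected abstract String doRequest(String baseUrl, String path) throws IOException;

  /** Tries each endpoint at most once, starting from the next one in rotation. */
  final String request(String path) throws IOException {
    int start = Math.floorMod(next.getAndIncrement(), baseUrls.size());
    IOException last = null;
    for (int i = 0; i < baseUrls.size(); i++) {
      String url = baseUrls.get((start + i) % baseUrls.size());
      try {
        return doRequest(url, path);
      } catch (IOException e) {
        last = e; // remember the failure and move on to the next endpoint
      }
    }
    throw last; // every endpoint failed; surface the most recent error
  }
}
```

Under this assumption, a JDK-based sibling and a Jetty-based sibling would 
differ only in how they implement `doRequest`.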



--
This message was sent by Atlassian Jira
(v8.20.10#820010)




Re: [PR] SOLR-16390: v2 Cluster Property APIs. [solr]

2024-10-25 Thread via GitHub


gerlowskija commented on code in PR #2788:
URL: https://github.com/apache/solr/pull/2788#discussion_r1817160658


##
solr/api/src/java/org/apache/solr/client/api/endpoint/SetClusterPropertyApi.java:
##
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.client.api.endpoint;
+
+import io.swagger.v3.oas.annotations.Operation;
+import io.swagger.v3.oas.annotations.Parameter;
+import io.swagger.v3.oas.annotations.parameters.RequestBody;
+import jakarta.ws.rs.PUT;
+import jakarta.ws.rs.Path;
+import jakarta.ws.rs.PathParam;
+import org.apache.solr.client.api.model.SetClusterPropertyRequestBody;
+import org.apache.solr.client.api.model.SolrJerseyResponse;
+
+@Path("/cluster/properties/{propertyName}")
+public interface SetClusterPropertyApi {
+
+  @PUT
+  @Operation(
+  summary = "Set a cluster property in this Solr cluster",

Review Comment:
   ```suggestion
 summary = "Set a single new or existing cluster property in this Solr cluster",
   ```



##
solr/api/src/java/org/apache/solr/client/api/endpoint/SetNestedClusterPropertyApi.java:
##
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.client.api.endpoint;
+
+import io.swagger.v3.oas.annotations.Operation;
+import io.swagger.v3.oas.annotations.parameters.RequestBody;
+import jakarta.ws.rs.PUT;
+import jakarta.ws.rs.Path;
+import org.apache.solr.client.api.model.SetNestedClusterPropertyRequestBody;
+import org.apache.solr.client.api.model.SolrJerseyResponse;
+
+@Path("/cluster/properties")
+public interface SetNestedClusterPropertyApi {
+
+  @PUT
+  @Operation(
+  summary = "Set nested cluster properties in this Solr cluster",
+  tags = {"cluster-properties"})
+  SolrJerseyResponse createOrUpdateNestedClusterProperty(

Review Comment:
   [Q] It's interesting that this API is both the only way to set 
"nested"/complex cluster properties, and the only way to set multiple 
properties simultaneously.
   
   I guess that's fine, since it mirrors what's supported in v1?  I don't 
really have a question or suggestion here, mostly just making a note of it...



##
solr/core/src/test/org/apache/solr/handler/admin/api/ClusterPropsAPITest.java:
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.handler.admin.api;
+
+import static org.apache.solr.common.util.Utils.getObjectByPath;
+
+import java.net.URL;
+import java.util.List;
+import org.apache.http.HttpResponse;
+import org.apache.http.client.methods.HttpDelete;
+import org.apache.http.client.methods.HttpGet;
+import org.apache.http.client.methods.HttpPut;
+