dsmiley commented on code in PR #3891:
URL: https://github.com/apache/solr/pull/3891#discussion_r2659451629


##########
changelog/unreleased/SOLR-18002-add-unresponsive-servers-to-zombie-list.yml:
##########
@@ -0,0 +1,8 @@
+# See https://github.com/apache/solr/blob/main/dev-docs/changelog.adoc
+title: Add unresponsive servers to zombie list

Review Comment:
   ```suggestion
   title: CloudSolrClient/LBSolrClient should consider a retry-able request 
that times out as another condition to internally mark that replica as a 
"zombie".
   ```
   
   I wish 'title' was named 'description' or 'summary'.  'title' suggests 
something shorter.



##########
solr/solrj/src/test/org/apache/solr/client/solrj/impl/LB2SolrClientTest.java:
##########
@@ -330,4 +356,57 @@ public void close() {
       }
     }
   }
+
+  private class ZombieTestContext implements AutoCloseable {
+    final ServerSocket blackhole;
+    final LBSolrClient.Endpoint nonRoutableEndpoint;
+    final HttpJettySolrClient delegateClient;
+    final LBAsyncSolrClient lbClient;
+
+    ZombieTestContext() throws Exception {
+      //create a socket that allows a client to connect but causes them to 
hang until idleTimeout is triggered
+      blackhole = new ServerSocket(0);
+      int blackholePort = blackhole.getLocalPort();
+      nonRoutableEndpoint =
+          new LBSolrClient.Endpoint("http://localhost:"; + blackholePort + 
"/solr");
+
+      delegateClient =
+          new HttpJettySolrClient.Builder()
+              .withConnectionTimeout(1000, TimeUnit.MILLISECONDS)
+              .withIdleTimeout(100, TimeUnit.MILLISECONDS)

Review Comment:
   I suppose an even smaller number would work too, like 1 millisecond?



##########
solr/solrj/src/test/org/apache/solr/client/solrj/impl/LB2SolrClientTest.java:
##########
@@ -202,6 +205,29 @@ public void testTwoServers() throws Exception {
     }
   }
 
+  public void testTimeoutExceptionMarksServerAsZombie() throws Exception {
+    try (ZombieTestContext ctx = new ZombieTestContext()) {

Review Comment:
   it's not apparent looking at this, that ZombieTestContext will induce a 
timeout.  That seems to be its fundamental purpose and thus is associated with 
just a couple tests.  At a glance, it appears to be a test utility I otherwise 
have never heard of.  A reviewer (me) has to then go look at the code for it, 
and infer some things to come to that conclusion.  If you prefix the name with 
`Timeout` then it'd help.



##########
changelog/unreleased/SOLR-18002-add-unresponsive-servers-to-zombie-list.yml:
##########
@@ -0,0 +1,8 @@
+# See https://github.com/apache/solr/blob/main/dev-docs/changelog.adoc
+title: Add unresponsive servers to zombie list
+type: fixed # added, changed, fixed, deprecated, removed, dependency_update, 
security, other

Review Comment:
   I view this as an improvement, thus "changed", rather than a bug fix.  WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to