bruno-roustant commented on a change in pull request #580:
URL: https://github.com/apache/solr/pull/580#discussion_r797347357



##########
File path: solr/core/src/java/org/apache/solr/core/SolrCores.java
##########
@@ -377,6 +370,12 @@ protected CoreDescriptor getUnloadedCoreDescriptor(String cname) {
     }
   }
 
+  boolean hasPendingCoreOps(String name) {

Review comment:
       Should it be protected?

##########
File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java
##########
@@ -2084,12 +2084,13 @@ public boolean isLoaded(String name) {
     return solrCores.isLoaded(name);
   }
 
-  public boolean isLoadedNotPendingClose(String name) {
-    return solrCores.isLoadedNotPendingClose(name);
+  /** The core is loading, unloading, or reloading. */
+  boolean hasPendingCoreOps(String name) {

Review comment:
       Shouldn't it be public? This seems useful information.

##########
File path: solr/core/src/test/org/apache/solr/core/TestLazyCores.java
##########
@@ -899,4 +890,67 @@ private void check10(SolrCore core) {
         , "//result[@numFound='10']"
     );
   }
+
+  public void testDontEvictUsedCore() throws Exception {
+    // If a core is being used for a long time (say a long indexing batch) but nothing else for it,
+    // and if the transient cache has pressure and thus wants to unload a core, we should not
+    // unload it (yet).
+
+    CoreContainer cc = init();
+    String[] transientCoreNames = new String[]{
+        "collection2",
+        "collection3",
+        "collection6",
+        "collection7",
+        "collection8",
+        "collection9"
+    };
+
+    try (LogListener logs = LogListener.info(TransientSolrCoreCacheDefault.class.getName())
+        .substring("NOT evicting transient core [" + transientCoreNames[0] + "]")) {
+      cc.waitForLoadingCoresToFinish(1000);
+      var solr = new EmbeddedSolrServer(cc, null);
+      final var longReqTimeMs = 2000; // if lower too much, the test will fail on a slow/busy CI
+
+      // First, start a long request on the first transient core
+      var thread = new Thread(() -> {
+        try {
+          // TODO Solr ought to have a query test "latch" mechanism so we don't sleep arbitrarily
+          solr.query(transientCoreNames[0], params("q", "{!func}sleep(" + longReqTimeMs + ",1)"));
+        } catch (SolrServerException | IOException e) {
+          fail(e.toString());
+        }
+      }, "longRequest");
+      thread.start();
+
+      System.out.println("Inducing pressure on cache by querying many cores...");
+      // Now hammer on other transient cores to create transient cache pressure
+      for (int round = 0; round < 5 && logs.getCount() == 0; round++) {
+        // note: we skip over the first; we want the first to remain non-busy
+        for (int i = 1; i < transientCoreNames.length; i++) {
+          solr.query(transientCoreNames[i], params("q", "*:*"));
+        }
+      }
+      // Show that the cache logs that it was asked to evict but did not.
+      // To show the bug behavior, comment this out and also comment out the corresponding logic
+      // that fixes it at the spot this message is logged.
+      assertTrue(logs.getCount() > 0);

Review comment:
       Where would this test fail if we commented out the logic that fixes the problem?
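
       On the TODO in this test about a query "latch" mechanism: a minimal sketch of
       the idea, assuming a plain java.util.concurrent.CountDownLatch instead of a
       timed sleep. LatchedRequest and its fields are hypothetical names used only
       for illustration; nothing like this exists in Solr as far as this thread shows.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a long-running request that a test can hold open. */
class LatchedRequest {
  /** Counted down once the "request" thread is running. */
  final CountDownLatch started = new CountDownLatch(1);
  /** The test counts this down to let the "request" complete. */
  final CountDownLatch release = new CountDownLatch(1);
  /** Set to true only after the request has been released. */
  volatile boolean finished = false;

  Thread start() {
    Thread thread = new Thread(() -> {
      started.countDown();
      try {
        // Block here until the test says so, rather than sleeping for a fixed time.
        release.await();
        finished = true;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "longRequest");
    thread.start();
    return thread;
  }
}
```

       With something like this, the test would wait on started, apply cache
       pressure, make its assertions, and only then count down release, so the
       outcome would not depend on how long a slow CI machine takes.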

##########
File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java
##########
@@ -2284,8 +2285,10 @@ public void run() {
 
       SolrCore core;
       while (!container.isShutDown() && (core = solrCores.getCoreToClose()) != null) {
+        assert core.getOpenCount() == 1;
         try {
-          core.close();
+          MDCLoggingContext.setCore(core);

Review comment:
       +1

##########
File path: solr/core/src/java/org/apache/solr/core/TransientSolrCoreCacheDefault.java
##########
@@ -89,6 +86,25 @@ public TransientSolrCoreCacheDefault(CoreContainer coreContainer) {
     transientDescriptors = new LinkedHashMap<>(initialCapacity);
   }
 
+  private void onEvict(SolrCore core) {
+    if (coreContainer.hasPendingCoreOps(core.getName())) {
+      if (log.isInfoEnabled()) {
+        log.info("NOT evicting transient core [{}]; it's loading or something else", core.getName());

Review comment:
       Maybe log the cache size?
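
       To illustrate the suggestion, here is a rough sketch of a size-bounded cache
       that declines to evict a busy entry and records the cache size when it does.
       It assumes a plain LinkedHashMap in access order, not the actual cache
       implementation in this PR. BusyAwareLruCache, hasPendingOps (a stand-in for
       coreContainer.hasPendingCoreOps) and the messages list (a stand-in for the
       logger) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical LRU cache that refuses to evict entries that still have pending operations. */
class BusyAwareLruCache<V> extends LinkedHashMap<String, V> {
  private final int maxSize;
  // Assumption: stands in for coreContainer.hasPendingCoreOps(name).
  private final Predicate<String> hasPendingOps;
  // Assumption: stands in for a logger so the example stays self-contained.
  final List<String> messages = new ArrayList<>();

  BusyAwareLruCache(int maxSize, Predicate<String> hasPendingOps) {
    super(16, 0.75f, true); // true = access order, so the eldest entry is the least recently used
    this.maxSize = maxSize;
    this.hasPendingOps = hasPendingOps;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
    if (size() <= maxSize) {
      return false; // no pressure, nothing to evict
    }
    if (hasPendingOps.test(eldest.getKey())) {
      // Include the current and maximum sizes so an overfull cache is visible in the log.
      messages.add("NOT evicting [" + eldest.getKey() + "]; cache size=" + size() + ", max=" + maxSize);
      return false;
    }
    return true; // evict the least recently used entry
  }
}
```

       Declining the eviction leaves the cache temporarily over its limit, which is
       why having the size in the message could help when reading the logs.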




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


